Commit Graph

56374 Commits

Author SHA1 Message Date
Stanislau Hlebik
b905d1399c mononoke: change order of log messages
Summary:
Previously we could have "Started ..." before "Starting ..."
This diff fixes it.

Reviewed By: krallin

Differential Revision: D20277406

fbshipit-source-id: 3c2f3fa1723c2e0852c6b114592ab7ad90be17ff
2020-03-06 10:24:24 -08:00
svcscm
0ae16d2266 Updating submodules
Summary:
GitHub commits:

6f3d8f43d9

Reviewed By: yns88

fbshipit-source-id: 1131ffb085118035a8c918841cbc84abee4b6826
2020-03-06 10:17:28 -08:00
Stanislau Hlebik
9df1cc0028 remotefilelog: prefetch files in chunks
Summary:
We saw >10 timeouts on the Mononoke side from people who fetch a lot of files from the corp network (https://fburl.com/scuba/mononoke_test_perf/qd15c5f1). They all tried to download a lot of files, and they hit the 90-minute timeout limit.

Let's try to download these files in rather large chunks (e.g. 200000 files). That should help with resumability and prevent timeout errors, and make retries smaller.

It adds an overhead for establishing a new connection after every 200000 files. That shouldn't be a big problem, and even if it's a big problem we can just increase remotefilelog.prefetchchunksize.
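
The chunking idea described above can be sketched as follows. This is only an illustration, not the remotefilelog implementation: `iter_chunks`, `prefetch_in_chunks`, and the `fetch_chunk` callback are hypothetical names, and the default of 200000 simply mirrors the figure quoted in the summary.

```python
from typing import Callable, Iterator, Sequence


def iter_chunks(items: Sequence[str], chunk_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices holding at most chunk_size items each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def prefetch_in_chunks(
    files: Sequence[str],
    fetch_chunk: Callable[[Sequence[str]], None],
    chunk_size: int = 200000,
) -> int:
    """Fetch files one chunk at a time and return the number of chunks.

    fetch_chunk is a hypothetical callback standing in for one
    connection's worth of work; a failure only loses the current chunk.
    """
    chunks_fetched = 0
    for chunk in iter_chunks(files, chunk_size):
        fetch_chunk(chunk)
        chunks_fetched += 1
    return chunks_fetched
```

Under this sketch, a retry after a timeout only needs to repeat the chunk that failed rather than the whole download.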

Reviewed By: xavierd

Differential Revision: D20286499

fbshipit-source-id: 316c2e07a0856731629447627ae65c094c6eecc5
2020-03-06 10:06:06 -08:00
Katie Mancini
b0a5ffb100 Expose number of imports queued
Summary: expose the counters for number of pending imports (blobs, trees, prefetches) to allow use in tooling

Reviewed By: chadaustin

Differential Revision: D20269853

fbshipit-source-id: d2b7e2110520290751699c4a891d41ebd5b374cf
2020-03-06 09:28:55 -08:00
Xavier Deguillard
b43f9b8d14 remotefilelog: retry fetches on dropped connections
Summary:
On bad network link (such as on VPN), the reliability of the connection to
Mercurial might be fairly flaky. Instead of failing fetching files, let's retry
a bit first, in the hope that the connection will be back by then.
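
A retry loop of the kind described here might look like the sketch below. The function name, the attempt count, and the delay are assumptions made for illustration; the actual change may retry differently.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retries(
    fetch: Callable[[], T],
    max_attempts: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying when the connection drops.

    Only ConnectionError is retried; any other exception propagates
    immediately. The last ConnectionError is re-raised once all
    attempts are used up.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            # Give the link a moment to come back before trying again.
            sleep(delay_seconds)
    raise AssertionError("unreachable")
```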

Reviewed By: quark-zju

Differential Revision: D20295255

fbshipit-source-id: d3038f5e4718b521ae4c3f2844f869a04ebb25e3
2020-03-06 08:32:16 -08:00
Jun Wu
3103fcf62b indexedlog: reload content after obtaining a lock at open time
Summary:
The old code does "read, lock, write", which is unsound because after "lock"
the data just read can be outdated and needs a reload.
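
To illustrate why the ordering matters, here is a small in-memory sketch. `SharedLog` and its methods are hypothetical stand-ins, not the indexedlog API; a list plays the role of the on-disk content and a `threading.Lock` plays the role of the file lock.

```python
import threading
from typing import List


class SharedLog:
    """Toy log whose stored list stands in for content on disk."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._stored: List[str] = []

    def read(self) -> List[str]:
        return list(self._stored)

    def append_read_lock_write(self, stale_view: List[str], entry: str) -> None:
        """Unsound order: the view was read before the lock was taken."""
        with self._lock:
            self._stored = list(stale_view) + [entry]

    def append_lock_reload_write(self, entry: str) -> None:
        """Sound order: reload after obtaining the lock, then write."""
        with self._lock:
            fresh_view = self.read()
            self._stored = fresh_view + [entry]
```

With the unsound order, any write that lands between the read and the lock is silently overwritten; reloading under the lock avoids that.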

Reviewed By: xavierd

Differential Revision: D20306137

fbshipit-source-id: a1c29d5078b2d47ee95cf00db8c1fcbe3447cccf
2020-03-06 08:12:02 -08:00
Thomas Orozco
9493a05e7b mononoke/filestore: update store_bytes to chunk content
Summary:
This updates the store_bytes method to chunk incoming data instead of uploading
it as-is. This is unfortunately a bit hacky (but so was the previous
implementation), since it means we have to hash the data before it has gone
through the Filestore's preparation.

That said, one of the invariants of the filestore is that chunk size shouldn't
affect the Content ID (and there is fairly extensive test coverage for this),
so, notionally, this does work.
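
The invariant can be demonstrated with any streaming hash. The sketch below uses SHA-256 purely as an assumption for illustration (the summary does not say which hash backs the Content ID), and `content_id` is a hypothetical helper.

```python
import hashlib


def content_id(data: bytes, chunk_size: int) -> str:
    """Hash data incrementally, feeding it in chunk_size pieces.

    Because the hasher consumes a byte stream, the digest depends only
    on the bytes, not on where the chunk boundaries fall.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    hasher = hashlib.sha256()
    for start in range(0, len(data), chunk_size):
        hasher.update(data[start:start + chunk_size])
    return hasher.hexdigest()
```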

Performance-wise, it does mean we are hashing the object twice. That actually
was the case before as well anyway (since obtaining the ContentId for FileContents
would clone them then hash them).

The upshot of this change is that large files uploaded through unbundle will
actually be chunked (whereas before, they wouldn't be).

Long-term, we should try and delete this method, as it is quite unsavory to
begin with. But, for now, we don't really have a choice since our content
upload path does rely on its existence.

Reviewed By: StanislavGlebik

Differential Revision: D20281937

fbshipit-source-id: 78d584b2f9eea6996dd1d4acbbadc10c9049a408
2020-03-06 07:43:07 -08:00
Thomas Orozco
56a7ce8697 mononoke/filestore: make FilestoreConfig Copy and pass it by value
Summary:
This is a very small struct (2 u64s) that really doesn't need to be passed by
reference. Might as well just pass it by value.

Differential Revision: D20281936

fbshipit-source-id: 2cc64c8ab6e99ee50b2e493eff61ea34d6eb54c1
2020-03-06 02:00:23 -08:00
Lukas Piatkowski
bdb3b625d1 blobstore: cover more blobstores to make them OSS buildable
Reviewed By: farnz

Differential Revision: D20221288

fbshipit-source-id: 708be6d429e673dcb4201b88541dff2bf9fca153
2020-03-06 01:33:38 -08:00
Lukas Piatkowski
7ddcdd818c mononoke: make sql_ext OSS buildable
Summary: separate out the Facebook-specific pieces of the sql_ext crate

Reviewed By: ahornby

Differential Revision: D20218219

fbshipit-source-id: e933c7402b31fcd5c4af78d5e70adafd67e91ecd
2020-03-06 01:33:38 -08:00
svcscm
cda098386f Updating submodules
Summary:
GitHub commits:

11c3570936
1bab211d77

Reviewed By: yns88

fbshipit-source-id: 23ba153d0a2b8217552f1a657167f5543ca498be
2020-03-06 01:33:37 -08:00
svcscm
62c61dd58e Updating submodules
Summary:
GitHub commits:

48d2547744
0b4e86c218
6a544e6fc7
e171a219d5
5b2efc3c29

Reviewed By: yns88

fbshipit-source-id: 76f20c7ae811ffcf6614f8f8dc02c9bfa88384d5
2020-03-05 18:37:23 -08:00
Durham Goode
a7e62ec3de hggit: add ability to skip commits using hggit.skipgithashes
Summary:
We've had cases where a git commit goes in that shouldn't be translated
to Mercurial. Let's add an option to skip the commit. Instead of skipping it
entirely (which would require complicated logic to then parent the following
commit on the last converted commit), let's just convert the skipped commit as
an empty commit. This should cover the cases we've encountered so far.
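
As a rough sketch of the approach, the conversion can keep every commit in the chain and simply drop the file changes of the skipped ones. The dictionary layout and the `convert_commits` function are hypothetical; only the `hggit.skipgithashes` option name comes from this change.

```python
from typing import Dict, Iterable, List, Set


def convert_commits(git_commits: Iterable[Dict], skip_hashes: Set[str]) -> List[Dict]:
    """Convert commits in order, emptying those listed in skip_hashes.

    A skipped commit still appears in the output with its parents
    intact, so the following commit needs no re-parenting logic.
    """
    converted = []
    for commit in git_commits:
        skipped = commit["hash"] in skip_hashes
        converted.append(
            {
                "git_hash": commit["hash"],
                "parents": list(commit["parents"]),
                # An empty mapping models an empty commit.
                "files": {} if skipped else dict(commit["files"]),
            }
        )
    return converted
```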

Reviewed By: krallin

Differential Revision: D20261743

fbshipit-source-id: da401863b09c2ac727aae1ceef10a0e8d8f98a7e
2020-03-05 15:06:30 -08:00
David Tolnay
754a755eee rust: Rename tokio_preview:: to tokio::
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/

This codemod replaces all dependencies on `//common/rust/renamed:tokio-preview` with `fbsource//third-party/rust:tokio-preview` and their uses in Rust code from `tokio_preview::` to `tokio::`.

This does not introduce any collisions with `tokio::` meaning 0.1 tokio because D20235404 previously renamed all of those to `tokio_old::` in crates that depend on both 0.1 and 0.2 tokio.

This is the tokio version of what D20213432 did for futures.

Codemod performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:tokio-preview, 1))" \
| xargs sed -i 's,\btokio_preview::,tokio::,'

rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:tokio-preview,fbsource//third-party/rust:tokio-preview,'
```

Reviewed By: k21

Differential Revision: D20236557

fbshipit-source-id: 15068b93a0a944d6249a1d9f63840a4c61c9c1ba
2020-03-05 14:25:10 -08:00
svcscm
a22c4becec Updating submodules
Summary:
GitHub commits:

db49eaf301
40f62d20c5
22d4322fd5
df865c4e34

Reviewed By: yns88

fbshipit-source-id: e13888e95abb9303faebf2b2eaa2f241e13b3fc5
2020-03-05 13:52:59 -08:00
Jun Wu
75e4ffc17f indexedlog: change IndexDef.lag_threshold back from entries to bytes
Summary:
I thought the index function could be the bottleneck. However, the Log reading
(xxhash, decoding vlqs) can be much slower for very long entries. Therefore
using bytes as the lag threshold is better. It does leak the Log
implementation details (how it encodes an entry) to some extent, though.

Reverts D20042045 and D20043116 logically. The lagging calculation is using
the new Index::get_original_meta API, which is easier to verify correctness
(In fact, it seems the old code is wrong - it might skip Index flushes if
sync() is called multiple times without flushing).

This should mitigate an issue where a huge entry (generated by `hg trace`) in
blackbox does not get indexed in time and cause performance regressions.

Reviewed By: DurhamG

Differential Revision: D20286508

fbshipit-source-id: 7cd694b58b95537490047fb1834c16b30d102f18
2020-03-05 13:29:48 -08:00
Jun Wu
efff6f3592 indexedlog: add an API to get the Index meta that is not dirty
Summary: This will be used to more reliably detect index lags.

Reviewed By: DurhamG

Differential Revision: D20286518

fbshipit-source-id: c553b6587363a55603b75df12580588e3100e35f
2020-03-05 13:29:47 -08:00
Jun Wu
66e60bacb9 rotatelog: build indexes for older logs on access
Summary:
This ensures indexes are complete even if index format or definition has been
changed.

Reviewed By: DurhamG

Differential Revision: D20286509

fbshipit-source-id: fcc4ebc616a4501e4b6fd2f1a9826f54f40b99b8
2020-03-05 13:29:47 -08:00
Jun Wu
669c58bd56 blackbox: use RotateLog::iter_dirty()
Summary:
This avoids loading all blackbox logs when `init()` gets called multiple times
(for example, once in Rust and once in Python).

Reviewed By: DurhamG

Differential Revision: D20286511

fbshipit-source-id: ef985e454782b787feac90a6249651a882b6552e
2020-03-05 13:29:47 -08:00
Jun Wu
1c6310b9d6 rotatelog: add iter_dirty() API
Summary: This API has the benefit that it does not trigger loading older logs.

Reviewed By: DurhamG

Differential Revision: D20286512

fbshipit-source-id: 426421691ad1130cdbb2305612d76f18c9f8798c
2020-03-05 13:29:46 -08:00
Thomas Orozco
3ee98c82e2 mononoke/microwave: add support for changesets
Summary:
This updates microwave to also support changesets, in addition to filenodes.
Those create a non-trivial amount of SQL load when we warm up the cache (due to
sequential reads), which we can eliminate by loading them through microwave.

They're also a bottleneck when manifests are loaded already.

Note: as part of this, I've updated the Microwave wrapper methods to panic if
we try to access a method that isn't instrumented. Since we'd be running
the Microwave builder in the background, this feels OK (because then we'd find
out if we call them during cache warmup unexpectedly).

Reviewed By: farnz

Differential Revision: D20221463

fbshipit-source-id: 317023677af4180007001fcaccc203681b7c95b7
2020-03-05 11:57:43 -08:00
Thomas Orozco
dd38f1fdb2 mononoke/cache_warmup: conditionally use microwave for faster warmup
Summary:
This incorporates microwave into the cache warmup process. See earlier in this
stack for a description of what this does, how it works, and why it's useful.

Reviewed By: ahornby

Differential Revision: D20219904

fbshipit-source-id: 52db74dc83635c5673ffe97cd5ff3e06faba7621
2020-03-05 11:57:43 -08:00
Jun Wu
64ba669a51 nameset: add some tests for DagSet
Summary:
With the new crate-public interfaces and Debug implementations it's possible to
write tests for DagSet. So let's do it.

Reviewed By: sfilipco

Differential Revision: D20242561

fbshipit-source-id: 180e04d9535f79471c79c4307f6ab6e8e8815067
2020-03-05 11:46:18 -08:00
Jun Wu
3e80ba4f99 repo: skip data migrations if repo lock cannot be taken
Summary: This avoids deadlock with edenfs-triggered debugimporthelper.

Reviewed By: simpkins

Differential Revision: D20270678

fbshipit-source-id: 6d3e7664b375d10ad2a8caeecaef5fa895264472
2020-03-05 11:42:19 -08:00
svcscm
9e87cf0897 Updating submodules
Summary:
GitHub commits:

75f9c85612
1a8c6ed94b
bbb8cb8218

Reviewed By: yns88

fbshipit-source-id: ffb4829bfae7427605923d5ee1fffc35d8436c80
2020-03-05 11:42:19 -08:00
svcscm
b91f34f77a Updating submodules
Summary:
GitHub commits:

a12753e479
d72b38e4aa

Reviewed By: yns88

fbshipit-source-id: 186a54f8db7f34ad34458e14a353bfa1a7811e3a
2020-03-05 09:42:45 -08:00
Xavier Deguillard
ef70d9eb08 scratch: silence warnings
Summary: The compiler was complaining about these on Windows.

Reviewed By: quark-zju

Differential Revision: D20250719

fbshipit-source-id: 89405e155875a4a549b243e93ce63cf3f53b1fab
2020-03-05 09:35:58 -08:00
Xavier Deguillard
34bce8690f revisionstore: silence compiler warning
Summary:
Don't restrict constructing a c_api datapack store to only Unix, we can
construct it on Windows too by assuming that their path will be valid UTF-8.

Reviewed By: quark-zju

Differential Revision: D20250718

fbshipit-source-id: 07234b6a71b50c803cfe3b962fa727f57037c919
2020-03-05 09:35:57 -08:00
Xavier Deguillard
751fc53638 types: add an ancestors method to RepoPath
Summary: This returns the ancestors in the reverse order of the parents method.

Reviewed By: sfilipco

Differential Revision: D20265277

fbshipit-source-id: 83277cee3d8e9070fc56d20d4c1877e6782c22f7
2020-03-05 09:31:32 -08:00
Katie Mancini
4f0c4a1b04 Track number of imports queued
Summary:
adds a counter to track the imports queued to enable more statistics exposure.

- Add counters to track the number of blob, tree, and prefetch imports that are pending

- have the counters increment (increment in constructor of wrapper struct) when the import is about to be queued

- have counters decrement once the load has completed (decrement in destructor of wrapper struct)
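
The increment-on-construction, decrement-on-destruction pattern in the list above can be approximated in Python with a context manager. This is an analogy only: the change itself uses a wrapper struct's constructor and destructor, and `PendingImportCounters` is a hypothetical name.

```python
import threading
from contextlib import contextmanager
from typing import Dict, Iterator


class PendingImportCounters:
    """Tracks how many imports of each kind are currently pending."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Dict[str, int] = {"blob": 0, "tree": 0, "prefetch": 0}

    @contextmanager
    def track(self, kind: str) -> Iterator[None]:
        # Entering plays the role of the wrapper's constructor.
        with self._lock:
            self._pending[kind] += 1
        try:
            yield
        finally:
            # Leaving plays the role of the destructor, even on error.
            with self._lock:
                self._pending[kind] -= 1

    def pending(self, kind: str) -> int:
        with self._lock:
            return self._pending[kind]
```

Tying the decrement to scope exit means the count stays accurate even when an import fails partway through.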

Reviewed By: chadaustin

Differential Revision: D20256410

fbshipit-source-id: 5536b46307b30fc19dc5747414727a86961c78e1
2020-03-05 09:03:06 -08:00
Pavel Aslanov
95bf3a32a4 Report bytes sent via perf counters for stream_out_shallow command
Summary: Report bytes sent via perf counters for `stream_out_shallow` command

Reviewed By: krallin

Differential Revision: D20283114

fbshipit-source-id: 1f354904c68322b941ff0c035bb0b811e41e74a1
2020-03-05 08:58:21 -08:00
Liubov Dmitrieva
bb2f81e26b mononoke_api: improve algo for stack calculation
Summary: Improvements aim to minimize number of db queries

Differential Revision: D20280711

fbshipit-source-id: 6cc06f1ac4ed8db9978e0eee956550fcd16bbe8a
2020-03-05 08:31:37 -08:00
Aida Getoeva
db19504972 mononoke: derive changeset info
Summary:
Implementation of derivation logic for the changeset info.

BonsaiDerived is implemented for the ChangesetInfo. `derive_from_parents` just derives an info and BonsaiDerivedMapping then puts it into the blobstore.

```
ChangesetInfo::derive(..) -> ChangesetInfo
```

Reviewed By: krallin

Differential Revision: D20185954

fbshipit-source-id: afe609d1b2711aed7f2740714df6b9417c6fe716
2020-03-05 08:24:38 -08:00
Aida Getoeva
09b03ce1bf mononoke: derived changeset info - data structures
Summary:
Introducing data structures for derived Bonsai changeset info, which is supposed to store all commit metadata except the file changes.

Bonsai changeset consists of the commit metadata and a set of all the file changes associated with the commit.
Some of the changesets, usually for merge commits, include thousands of file changes. It is not a problem by itself; however, in cases where we need to know some information about the commit apart from its hash, we have to fetch the whole changeset, which can take up to 15-20 seconds.

Changeset info as a separate data structure is needed to speed up changeset fetching process: when we need to use commit metadata but not the file changes.

Reviewed By: markbt

Differential Revision: D20139434

fbshipit-source-id: 4faab267304d987b44d56994af9e36b6efabe02a
2020-03-05 08:24:38 -08:00
svcscm
c005f8a87e Updating submodules
Summary:
GitHub commits:

f812043283

Reviewed By: yns88

fbshipit-source-id: 92ce2d487ec70fe1b7b6cae0d224b4c085a6a4ef
2020-03-05 08:24:37 -08:00
Jun Wu
7c9e74aa09 pytracing: make ascii() return bytes on Python 2
Summary: Do not leak unicode to Python 2.

Reviewed By: simpkins

Differential Revision: D20269851

fbshipit-source-id: ebd1b0678b1335a951c9655210601dd80842336e
2020-03-05 07:35:26 -08:00
Liubov Dmitrieva
047862c02c mononoke: add 'repo_stack_info' API
Summary:
The new API is required for migrating Commit Cloud off hg servers and the infinitepush database.

This also can fix phases issues with `hg cloud sl`.

Reviewed By: markbt

Differential Revision: D20221913

fbshipit-source-id: 67ddceb273b8c6156c67ce5bc7e71d679e8999b6
2020-03-05 05:48:32 -08:00
Alex Hornby
cbb3996141 mononoke: walker: fix waiting on tail
Summary:
Fix the tail interval delay, it wasn't triggering.

Took the opportunity to structure the code as a loop as well which simplified it a bit.

Reviewed By: markbt

Differential Revision: D20247077

fbshipit-source-id: 1786ef1528a4b0493f5e454d28450d7198af8ad4
2020-03-05 05:41:02 -08:00
Lukas Piatkowski
ddeeeb65e0 Re-sync with internal repository
2020-03-05 11:56:21 +01:00
Adam Simpkins
a0358352da remove an integration test for handling SIGKILL after SIGSTOP
Summary:
Remove a failing integration test that was testing behavior we don't really
care about.

My changes in D20210708 made this test start failing.  This integration test
was initially added to exercise the code I reverted in D20210708.

This test fails when EdenFS is invoked in the foreground and under sudo.  If
you send SIGSTOP to the EdenFS process sudo happens to notice this and send
the same signal to itself too.  This results in a state where the `sudo`
command is stopped and is never resumed so it never wakes up to reap its child
EdenFS process when EdenFS exits.  The behavior I reverted in D20210708 caused
the edenfsctl CLI code to simply ignore the fact that EdenFS was stuck in a
zombie state, and proceed anyway.  This allowed EdenFS to at least restart,
but it left old zombies stuck forever on the system.

This problem is arguably an issue with how sudo operates, and it's sort of
hard for us to work around.  To solve the problem you need to send SIGCONT to
the sudo process, but since it is running with root privileges you don't
normally have permission to send a signal to it.  It is understandable why
sudo behaves this way, since normally it is desirable for sudo to background
itself when the child is stopped.

In practice this isn't really ever a situation that we care much about
handling.  Normal users shouldn't ever get into this situation (they don't run
EdenFS in the foreground, and they generally don't run it under sudo either).

Reviewed By: genevievehelsel

Differential Revision: D20268924

fbshipit-source-id: d61d0a10ee1e132f00dbd2e4dc135808b7c79345
2020-03-04 22:15:49 -08:00
svcscm
24b4397f88 Updating submodules
Summary:
GitHub commits:

b678d1cb1b
495a7ee430
b422eebd35
122c7f535e
afb97094ae
dd87757acc
188a485afd
79afc6b105
d875dbecf7

Reviewed By: yns88

fbshipit-source-id: befd87e060a6562d9ed138940590ad58c5769626
2020-03-04 22:15:48 -08:00
svcscm
63acbfa710 Updating submodules
Summary:
GitHub commits:

e0952945a0
790e96f68d
1e7d5049b2
807748616d
15281d7253
05ad8b1149
6027aab01d
7c9d52e735
1b6810a828

Reviewed By: yns88

fbshipit-source-id: a5c89912017b02aae523ae68bdae0ca62b68fdcc
2020-03-04 20:07:37 -08:00
Durham Goode
e2ff8d5da2 infinitepush: remove transaction that spans pull
Summary:
D18538145 introduced a transaction that spans the entire infinitepush
pull. This has a couple of unfortunate consequences:

1. hg pull --rebase now aborts the entire pull if the rebase hits a conflict,
since it's unable to commit the transaction.
2. If tree prefetching fails, it aborts the entire pull as well.

Tests seem to work fine if we scope down this lock.

Reviewed By: xavierd

Differential Revision: D20260480

fbshipit-source-id: d84228ababdb5572401645f74e78df035bf1461b
2020-03-04 19:49:26 -08:00
Jun Wu
bb1562604a dag: make some test APIs public in crate
Summary: Those will be reused by nameset::DagSet.

Reviewed By: sfilipco

Differential Revision: D20242563

fbshipit-source-id: 944e9a04aeb15439256ecea64355b67e326e5c89
2020-03-04 17:33:25 -08:00
Jun Wu
b8e1477401 nameset: impl Debug for other sets
Summary:
This is useful for `assert_eq!(format!("{:?}", set), "...")` tests.

It will be eventually exposed to Python as `__repr__`, similar to Python's
smartsets.

Reviewed By: sfilipco

Differential Revision: D20242562

fbshipit-source-id: 5373bb180db7cafebf273ace7cf2cb80fbfb8038
2020-03-04 17:33:25 -08:00
Jun Wu
fa069204e3 nameset: impl Debug for StaticSet
Summary:
In the Python world all smartsets have some kind of "debug" information. Let's
do something similar in Rust.

Related code is updated so the test is more readable.

Reviewed By: sfilipco

Differential Revision: D20242564

fbshipit-source-id: 7439c93d82d5d037c7167818f4e1125c5a1e513e
2020-03-04 17:33:24 -08:00
Yixian Jiang
19472ea493 Remove dependency on fbzmq::ResourceMonitor & sigar
Summary:
Replace the methods to get CPU and memory usage statistics:
- For the memory: use `VmRSS` of `/proc/[pid]/status`: http://man7.org/linux/man-pages/man5/proc.5.html
- For the CPU%: calculate what percentage of CPU time the process occupies, using `getrusage()`: http://man7.org/linux/man-pages/man2/getrusage.2.html
  - Implemented like sigar: https://our.intern.facebook.com/intern/diffusion/FBS/browse/master/third-party/sigar/src/sigar.c?commit=4f945812675131ea64cb3d143350b1414f34a351&lines=111-169
  - Formula:
    - CPU% = `process used time` during the period / `time period` * 100
    -  `time period` = current query timestamp - last query timestamp
    - `process used time` = current `process total time` - last query `process total time`
    - `process total time` = CPU time used in user mode + CPU time used in system mode // get from the API `ru_utime` and `ru_stime`
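
The formula above can be written out as a short sketch. It uses Python's standard `resource` module (Unix only) rather than the C++ code in this change, and the function names are hypothetical.

```python
import resource


def process_total_time() -> float:
    """CPU seconds used so far: user mode (ru_utime) plus system mode (ru_stime)."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime + usage.ru_stime


def cpu_percent(
    prev_total: float,
    prev_timestamp: float,
    cur_total: float,
    cur_timestamp: float,
) -> float:
    """CPU% = process used time during the period / time period * 100."""
    time_period = cur_timestamp - prev_timestamp
    if time_period <= 0:
        # No time elapsed between queries, so no meaningful rate.
        return 0.0
    process_used_time = cur_total - prev_total
    return process_used_time / time_period * 100.0
```

A caller would sample `process_total_time()` together with a wall-clock timestamp at each query and pass the previous and current samples to `cpu_percent`.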

Remove the `fbzmq::ResourceMonitor` and `sigar`:
- Change and rename the UT
  - `ResourceMonitorTest.cpp` -> `SystemMetricsTest.cpp`
  - `ResourceMonitor` -> `SystemMetricsTest` in `openr/tests/OpenrSystemTest.cpp`
- Remove `ResourceMonitor` code and dependency for `Watchdog` and `ZmqMonitor`
- Remove `sigar` dependency used in building

Reviewed By: saifhhasan

Differential Revision: D20049944

fbshipit-source-id: 00b90c8558dc5f0fb18cc31a09b9666a47b096fe
2020-03-04 16:37:28 -08:00
svcscm
381d833bdd Updating submodules
Summary:
GitHub commits:

ea70ea7027
c1843615ce
c2ebbb881f
ba0e4d0acf

Reviewed By: yns88

fbshipit-source-id: 862dcc9407e135679db7514d6affcbaeab998723
2020-03-04 16:06:45 -08:00
Jun Wu
0ae5a59e9e indexedlog: fix metadata-only updates for Indexes
Summary:
Previously, `flush()` would skip writing the file if there are only metadata
changes. Fix it by detecting metadata changes.

This can potentially fix an issue that certain blackbox indexes are empty,
lagging and require scanning the whole log again and again. In that case,
the index itself is not changed (the root radix entry is not changed), but
only the metadata tracking how many bytes in Log the index covered
changed.
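
The fix can be pictured with a toy model in which the dirty check considers both the root entry and the metadata. `IndexSketch` is a hypothetical class written for illustration and does not reflect the real indexedlog types.

```python
class IndexSketch:
    """Toy index with a root entry and separate metadata bytes."""

    def __init__(self) -> None:
        self.root = 0
        self.meta = b""
        self._flushed_root = 0
        self._flushed_meta = b""
        self.writes = 0  # counts simulated file writes

    def is_dirty(self) -> bool:
        # Checking only the root would miss metadata-only updates.
        return self.root != self._flushed_root or self.meta != self._flushed_meta

    def flush(self) -> bool:
        """Write if anything changed; return whether a write happened."""
        if not self.is_dirty():
            return False
        self._flushed_root = self.root
        self._flushed_meta = self.meta
        self.writes += 1
        return True
```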

Reviewed By: sfilipco

Differential Revision: D20264627

fbshipit-source-id: 7ee48454a92b5786b847d8b1d738cc38183f7a32
2020-03-04 15:59:12 -08:00
Jun Wu
33d65ac5eb test-doctor: fix test on filesystems without symlink
Summary:
On filesystems without symlinks, the test fails because ln prints errors.

Fix the test by using `#if symlink`.

Reviewed By: DurhamG

Differential Revision: D20260904

fbshipit-source-id: 1d0ffcc7e95d2718087fb01297369ca276b59013
2020-03-04 15:52:53 -08:00