Summary:
This adds file access logging in the Inode layer of the code. We log the cause (channel, thrift, unknown) and the cause detail (the FUSE opcode for fuse, the function name for thrift). We do the work on a worker thread since we need to traverse the inode map to translate an inode number to a file path. The worker thread in this code is largely based on the one in the ProcessNameCache.
At this moment, I am not worrying about logging inode creation as a file access. The nice thing about putting this in FileInode is that it should be able to be used by NFS for free.
I've left a TODO in the code about not logging files that match gitignore patterns since it's not a hard blocker for this to land.
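As a rough illustration of the worker-thread approach described above (not the actual C++ implementation), the sketch below enqueues access events on the calling thread and resolves inode numbers to paths on a background thread. The `FileAccessLogger` class, the `INODE_PATHS` table, and the example values are hypothetical stand-ins.

```python
import queue
import threading

# Hypothetical stand-in for the inode map: inode number -> path.
INODE_PATHS = {1: "", 2: "src", 3: "src/main.cpp"}


class FileAccessLogger:
    """Queues access events and resolves them to paths off the hot path."""

    def __init__(self):
        self._queue = queue.Queue()
        self.records = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def log_access(self, inode_number, cause, cause_detail):
        # Called by the filesystem code: only enqueue, never resolve here.
        self._queue.put((inode_number, cause, cause_detail))

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel pushed by close()
                break
            inode_number, cause, cause_detail = item
            # The potentially slow inode -> path translation happens here.
            path = INODE_PATHS.get(inode_number, "<unknown>")
            self.records.append(
                {"path": path, "cause": cause, "cause_detail": cause_detail}
            )

    def close(self):
        # Drain everything queued so far, then stop the worker.
        self._queue.put(None)
        self._thread.join()


logger = FileAccessLogger()
logger.log_access(3, "thrift", "getSHA1")  # illustrative values only
logger.close()
```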
Reviewed By: kmancini
Differential Revision: D29128258
fbshipit-source-id: 3e08a3567fed937a381b58847ea83569d70f0ea2
Summary: NfsTaskQueue can be made more generic to be shared across the codebase, so this makes it its own target in `eden/fs/utils` w/ the name EdenTaskQueue.
Reviewed By: xavierd
Differential Revision: D29244762
fbshipit-source-id: 78348f2ff8fa66bc801aefe7d6b3905e0da278e8
Summary: Just to be more specific, let's be explicit that this is the serverThreadPool that the mount is returning.
Reviewed By: xavierd
Differential Revision: D29244719
fbshipit-source-id: 015a2e8198c418c6fb5a89234274eefb329848fe
Summary: In order to access the kill switch for file access logging, we need to store an EdenConfig pointer in the HiveLogger.
Reviewed By: xavierd
Differential Revision: D29120227
fbshipit-source-id: c828bbbf551c7096877bfa763edc29ef133afb41
Summary: This creates a HiveLogger and threads it through to the ServerState.
Reviewed By: xavierd
Differential Revision: D29119501
fbshipit-source-id: d0b74733d8832d604f8f4cf0af28e767dbddeddf
Summary: This just adds a method to get the repoName out of the EdenMount for logging purposes.
Reviewed By: xavierd
Differential Revision: D29113121
fbshipit-source-id: 9ebb4ac588839a99f25f4e884a7a132f5793e49e
Summary: Looks like another test needs updating after D29437983 (5b1b1febc1).
Reviewed By: quark-zju
Differential Revision: D29446063
fbshipit-source-id: c71684609830b6ec047a2efc615b806dd2a0066b
Summary:
With the lazy pull fast path, the legacy commit scan will trigger one-by-one id
resolution, which is terrible and defeats the purpose of laziness. Let's just
force the graphql path instead.
Reviewed By: andll
Differential Revision: D29440142
fbshipit-source-id: 4b0d10e6f5a3eb0533b2d95b1889a1f1c1f281cb
Summary:
Similar to D29404057 (cedddd1c8d), add a way to disable resolving IDs by setting
a limit using `EDENSCM_REMOTE_ID_THRESHOLD`.
Reviewed By: andll
Differential Revision: D29440143
fbshipit-source-id: 30409089493ae2cd5c189e37b0d4f88df9a6d8e8
Summary: The Rust NameSet does not have nullid. Do not bother resolving remotely.
Reviewed By: kulshrax
Differential Revision: D29437983
fbshipit-source-id: f1eb73e738281b3b5096fba22ba92833f5a4f3ee
Summary:
See this graph - https://fburl.com/scuba/mononoke_blobstore_trace/77nyw174. It
shows that there was a huge jump in the number of "hgmanifest" fetches from
scs that didn't exist in the manifold. This is not great for mononoke, because
it uses a significant chunk of our manifold qps capacity (~30K qps on average,
with spikes going up to 100K qps). The main suspect here is
repo_list_hg_manifest method, because few other scs methods (if any) fetch hg
data.
It correlates with the release of a new eden version on June 23, which added
the getTreeBatch() method, and it looks like it was passing the manifest hash
and eden id in the wrong order. See https://fburl.com/code/tsjwss8a
```
proxyHash.revHash(), // this is really the manifest node
```
This is a comment in the `getTree()` method, which says that `proxyHash.revHash()`
is the manifest id, which is exactly what we need to send to scs.
[ScsMetadataImport::getTreeMetadata](https://fburl.com/code/xammvq2k) suggests
that the second parameter should be manifestId (i.e. proxyHash.revHash()), but
we were passing them in the wrong order. Let's fix it.
Reviewed By: xavierd
Differential Revision: D29428581
fbshipit-source-id: d041008f00c7519504c6c67173ca85709e9dc415
Summary:
The remotenames information might be out of sync with the DAG.
Use the DAG's master group's head for the pull fast path "old" master so that
even if the remotenames is stale for some reason, the fast path still works.
Reviewed By: andll
Differential Revision: D29434719
fbshipit-source-id: 7bbd0757dc6a9428f86663ac4989f7c30b365b46
Summary: Will be used by the next change.
Reviewed By: andll
Differential Revision: D29434722
fbshipit-source-id: 74dbec506fb0985379480815380118cd41058aec
Summary:
If a user hits Ctrl+C (or something else interrupts) between flushing the
changelog and flushing the metalog (remotenames), the pulled commits will exist
in the local graph and break the fast path the next time it is used.
Reviewed By: andll
Differential Revision: D29434720
fbshipit-source-id: e7ca7542691279644679effb06707a1d305541af
Summary:
This is a better practice. It avoids debug messages from other places (ex.
discovery).
Reviewed By: andll
Differential Revision: D29434721
fbshipit-source-id: 5990f0174e9b61aed7d9d56252df63b571364070
Summary:
The remotenames has logic to work with legacy clones to write the remotenames
state. However, modern clone paths take care of remotenames directly so there
is no need to do an extra pull by remotenames.
Reviewed By: DurhamG
Differential Revision: D29428883
fbshipit-source-id: a73c0ee716b09f4e34d6fa30997f961284678d13
Summary:
This makes it pick up some `%include` configs and can avoid some surprises.
Also remove the manual `paths.default` setconfig. That seems to cause
`destrepo.edenapi` to be `None` and to break lazy changelog clone paths.
Reviewed By: DurhamG
Differential Revision: D29428882
fbshipit-source-id: bafeb195a560e35be17355b793613b60d97fbecf
Summary: Make dynamicconfig pick up the repo name because hgrc gets written.
Reviewed By: DurhamG
Differential Revision: D29428878
fbshipit-source-id: fbf578cd7c770a4541fff3b85ff40c40cd5a6cc5
Summary: This will be used by the next change.
Reviewed By: DurhamG
Differential Revision: D29428879
fbshipit-source-id: 69e0ffac12fb9c442488d59ea8faa0ea4b47a2c1
Summary: This makes it easier to figure out a traceback using an empty repo name.
Reviewed By: DurhamG
Differential Revision: D29428881
fbshipit-source-id: 95a09c691e3d921ad4f960a39002f71ec879d927
Summary:
This makes it easier to debug wrong reponame issues.
In theory those need to be checked and the config needs to be regenerated if
they are changed for correctness. Given that username and reponame are rarely
changed, I saved it for later.
Reviewed By: DurhamG
Differential Revision: D29428880
fbshipit-source-id: f996af6a7a1e329faaa8b0a53dac8621fa94dac8
Summary: Leftover from D29404901 (410769c529).
Reviewed By: DurhamG
Differential Revision: D29429146
fbshipit-source-id: b37c89745d924efc28110d8b96e9b51162b6570b
Summary: This avoids issues in assertions used by metadataonlyctx.
Reviewed By: DurhamG
Differential Revision: D29430672
fbshipit-source-id: 4de305d62f65a9d6c4fb2e3f375ef20d9d94d41c
Summary:
We do not respect `$PAGER`, and the default is the internal streampager.
Also document streampager config options.
Reviewed By: markbt
Differential Revision: D29412598
fbshipit-source-id: 706f09f8805e9067b46eec7d83c6742b61084b09
Summary:
See also D29400532 (909411bb1c). It turns out that it might be more desirable to just mix
the stdout and stderr streams in streampager. For example, with them mixed,
the graph log output can show what network fetches or calculations are done
before outputting the graph lines. This is also more consistent with the
vanilla terminal (no pager) behavior.
Reviewed By: markbt
Differential Revision: D29412531
fbshipit-source-id: c07f68b12498a7cee6152bbecbb58d5a7e64097a
Summary:
Previously it wasn't possible because the symlink target was the key in the map
that mega_grepo_sync was sending to scs, so we couldn't have two different
symlinks for the same symlink target. However, we actually need that: some AOSP
repos have different symlink sources that point to the same symlink target.
This diff fixes it by swapping the key and value in the `linkfiles` map.
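The limitation can be seen with a plain dictionary. This is a minimal sketch, assuming the old map was keyed by symlink target and the new one is keyed by symlink source; the paths are made up for illustration.

```python
# Old orientation (assumed): symlink target -> symlink source.
# A second source for the same target silently replaces the first one.
old_linkfiles = {}
old_linkfiles["shared/Android.bp"] = "a/Android.bp"
old_linkfiles["shared/Android.bp"] = "b/Android.bp"  # overwrites "a/Android.bp"

# New orientation: symlink source -> symlink target.
# Several sources can now point to the same target.
new_linkfiles = {
    "a/Android.bp": "shared/Android.bp",
    "b/Android.bp": "shared/Android.bp",
}
```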
Differential Revision: D29359634
fbshipit-source-id: da74d6e934350822d82d2135ab06c754824525c9
Summary:
This is just updating the os_info crate to my fork with a fix for Centos
Stream: https://github.com/stanislav-tkach/os_info/pull/267
Reviewed By: quark-zju
Differential Revision: D29410043
fbshipit-source-id: 3642e704f5a056e75fee4421dc59020fde13ed5e
Summary: Move the scmstore implementation from the `specialized` module to the root of the `scmstore` module.
Reviewed By: kulshrax
Differential Revision: D29405779
fbshipit-source-id: ae2ef9cc05337a0ff81f5ba5b7051792207fee82
Summary: `scmstore` is dead, long live `scmstore`.
Reviewed By: kulshrax
Differential Revision: D29405613
fbshipit-source-id: 3252a545f5b944d14c15b2a777b84a99a2d4c293
Summary: Update unit tests in `revisionstore::indexedlogdatastore` to use new scmstore instead of old scmstore.
Reviewed By: kulshrax
Differential Revision: D29405258
fbshipit-source-id: 3d2e8cd313dbe66a257433702402804f490bdf47
Summary:
Update unit tests in `revisionstore::edenapi::data` to use new scmstore. There's not really a wrapper to exercise anymore for edenapi specifically, so it's probably better to just make these `scmstore` unit tests rather than tests specific to edenapi (or to indexedlogdatastore, as in the next change).
For ease of unit testing, make fetch_logger optional and introduce an `empty` constructor.
Reviewed By: kulshrax
Differential Revision: D29397495
fbshipit-source-id: d7ef0df16cf83a2506606c55c78fcbfa684904d7
Summary:
The server is expected to provide head (of all segs), parents (of each seg),
roots (of all segs). We checked roots and parents but only checked head in debug
builds. Let's check head in release builds too.
Reviewed By: andll
Differential Revision: D29405816
fbshipit-source-id: 1a97eb52a9a0d1d444ae5dabd1a01f0786be9fa9
Summary:
The external grep command (originally from fb-hgext's tweaks of grep command)
was run without redirecting its stdout. That could be problematic for
streampager, because writing directly to stdout will mess up the streampager
interface.
For example, the following command will show "nothing", while it should print
something:
fbcode/antlir/fbpkg/build % lhg grep -l scuba --config pager.pager=internal:streampager --config grep.usebiggrep=False --config pager.interface=full
Fix it by using a PIPE and forwarding the external command's output to the
streampager buffer properly.
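The general pattern looks roughly like the sketch below, which is not the actual hg code. `run_and_forward` is a hypothetical helper, an in-memory buffer stands in for the pager's output stream, and a small Python child process stands in for the external grep.

```python
import io
import subprocess
import sys


def run_and_forward(argv, out):
    """Run argv with stdout captured and copy its output into `out`."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE) as proc:
        # The child no longer writes to the terminal directly; every line
        # goes through `out`, which the pager controls.
        for line in proc.stdout:
            out.write(line)
    return proc.returncode


pager_buffer = io.BytesIO()  # stand-in for the streampager buffer
command = [sys.executable, "-c", "print('match.txt')"]  # stand-in for grep
status = run_and_forward(command, pager_buffer)
```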
Reviewed By: andll
Differential Revision: D29408821
fbshipit-source-id: 4b2a0a6bbd64aa00d09f921d830b173cc56ae630
Summary:
Found by xavierd. The recent `os_info` bump now detects CentOS as OracleLinux.
Work around it to keep our repo functional.
Reviewed By: xavierd
Differential Revision: D29410415
fbshipit-source-id: 1bd8183f46e3c2265aef119e9f96d9d05a5dbae6
Summary: I think someone landed a dependency change or something and forgot to update autocargo.
Reviewed By: dtolnay
Differential Revision: D29402335
fbshipit-source-id: e9a4906bf249470351c2984ef64dfba9daac8891
Summary: Previously, hg would always try to fetch data via EdenAPI, even when doing so wouldn't make sense (for example, a local test repo). The only time it makes sense to try to use EdenAPI at all is when the repo in question is supported by Mononoke. As a quick heuristic, let's disable EdenAPI if `paths.default` is not set to a `mononoke://` URL.
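A minimal sketch of the heuristic, assuming it only inspects the URL scheme of `paths.default`; the function name and the example URL are hypothetical.

```python
def should_use_edenapi(paths_default):
    """Return True only when paths.default is a mononoke:// URL."""
    return bool(paths_default) and paths_default.startswith("mononoke://")


remote_repo = should_use_edenapi("mononoke://example.com/repo")
local_test_repo = should_use_edenapi("/tmp/test-repo")
unset = should_use_edenapi(None)
```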
Differential Revision: D29372739
fbshipit-source-id: b3601e529ed44c355cb192aedf4332317f4e3132
Summary: Add an option to allow manually forcing EdenAPI to be enabled or disabled. This is useful in a variety of cases, such as bypassing the normal EdenAPI activation logic in tests, or to forcibly disable EdenAPI in cases where it isn't working correctly.
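One way to picture the override is as a tri-state setting that takes precedence over the normal activation logic. The sketch below assumes that shape; `edenapi_enabled` and its parameters are hypothetical names, not the real option.

```python
from typing import Optional


def edenapi_enabled(forced: Optional[bool], heuristic_result: bool) -> bool:
    """An explicit True/False wins; None falls back to the normal logic."""
    if forced is not None:
        return forced
    return heuristic_result


forced_on_in_test = edenapi_enabled(True, heuristic_result=False)
forced_off = edenapi_enabled(False, heuristic_result=True)
default_behavior = edenapi_enabled(None, heuristic_result=True)
```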
Differential Revision: D29377923
fbshipit-source-id: f408efe2a46ef3f1bd2914669310c3445c7d4121
Summary:
Users are used to seeing hints at the end of the output, not in a separate
panel in streampager. Make it so.
Reviewed By: andll
Differential Revision: D29400532
fbshipit-source-id: 41dc5d20f44a315bc0235ca3cb7857ae8955d3ad
Summary: This will be used by the next change.
Reviewed By: andll
Differential Revision: D29400533
fbshipit-source-id: e6b90bedd8d8a6cf9452dfb5c5f14f9980e12f62
Summary:
A more straightforward way of doing D29404055 (e6ea02372c). Return the non-lazy set directly from
Rust. This avoids some overhead.
Note: ignoring whitespace will make reviewing easier.
Reviewed By: andll
Differential Revision: D29404901
fbshipit-source-id: 02e4766256863fe3fe258bcb318473355cd1efe4
Summary:
Since the overlay file contains a small header, we need to make sure to account
for it when calling fallocate, otherwise the file would be created 64 bytes too
small.
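The size arithmetic can be sketched as follows. This is an illustration rather than the EdenFS code: `allocate_overlay_file` is a hypothetical helper, the 64-byte header size is taken from the description above, and `ftruncate` is used as a fallback where `posix_fallocate` is unavailable.

```python
import os
import tempfile

HEADER_SIZE = 64  # header size implied by the "64 bytes too small" symptom


def allocate_overlay_file(fd, content_size):
    """Reserve room for the header plus the file contents."""
    total = HEADER_SIZE + content_size
    try:
        os.posix_fallocate(fd, 0, total)
    except (AttributeError, OSError):
        # posix_fallocate is missing or unsupported here; just set the size.
        os.ftruncate(fd, total)
    return total


with tempfile.TemporaryFile() as f:
    allocated = allocate_overlay_file(f.fileno(), 4096)
    size_on_disk = os.fstat(f.fileno()).st_size
```

Passing only `content_size` to the allocation call would leave the file short by exactly `HEADER_SIZE` bytes.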
Reviewed By: kmancini
Differential Revision: D29382512
fbshipit-source-id: 24c49984b6cf2080f2a5b1fbb4796e4a1806f96a
Summary: This was used to narrow down issues.
Reviewed By: andll
Differential Revision: D29404054
fbshipit-source-id: 3bfdac332d63bdb13f40d5cf23dacec242b46d52
Summary:
Running `hasnode` in a loop is inefficient with lazy changelog.
Update it so it does batch filtering and does not ask the remote service.
With the change, BackupState.heads will only contain locally known heads.
So the extra filtering is removed.
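The difference between the two approaches can be sketched like this. Both functions are hypothetical; a Python set stands in for the locally known part of the changelog, and `hasnode` is passed in as a callable that may be expensive with a lazy changelog.

```python
def filter_known_loop(nodes, hasnode):
    """Old shape: one lookup per node, each of which may go remote."""
    return [node for node in nodes if hasnode(node)]


def filter_known_batch(nodes, local_nodes):
    """New shape: one batch filter against locally known nodes only."""
    known = set(nodes) & local_nodes
    # Keep the caller's ordering while dropping unknown nodes.
    return [node for node in nodes if node in known]


local_nodes = {"a1", "b2"}
heads = filter_known_batch(["a1", "ff", "b2"], local_nodes)
```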
Reviewed By: andll
Differential Revision: D29404056
fbshipit-source-id: 327bd4c292ad75fd14a47587482440b6ece1d2d5