Commit Graph

561 Commits

Author SHA1 Message Date
Katie Mancini
76df592222 allow multiple prefixes for paths to be logged
Summary:
Currently we use a single path prefix to configure data fetch logging in eden
(i.e., if the path of a fetched file extends our configured path, then we log
that data fetch).

There is some interest in extending this to multiple path prefixes, so that we
can log separate parts of the repo.
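
A minimal sketch of the multi-prefix check described above, assuming the configured prefixes are plain strings and that a raw string-prefix comparison is sufficient. `shouldLogFetch` and the sample paths are hypothetical names for illustration, not EdenFS identifiers.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: true if `path` starts with any of the configured
// prefixes. Raw string-prefix comparison is an assumption of this sketch.
inline bool shouldLogFetch(
    std::string_view path,
    const std::vector<std::string>& prefixes) {
  return std::any_of(
      prefixes.begin(), prefixes.end(), [path](const std::string& prefix) {
        return path.substr(0, prefix.size()) == std::string_view(prefix);
      });
}

// Usage with made-up prefixes and paths.
inline void shouldLogFetchExample() {
  const std::vector<std::string> prefixes{"project_a/", "project_b/src/"};
  assert(shouldLogFetch("project_a/README", prefixes));
  assert(shouldLogFetch("project_b/src/main.cpp", prefixes));
  assert(!shouldLogFetch("project_b/docs/index.md", prefixes));
}
```

With a single prefix the vector simply has one element, so the earlier behavior is a special case of this check.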

Reviewed By: StanislavGlebik

Differential Revision: D22877942

fbshipit-source-id: f6eb3dcb4fa460b4acab09677e972caf9421ddff
2020-09-02 22:54:23 -07:00
Xavier Deguillard
340866508b store: rename the Fuse fetch cause
Summary:
In the code base, "channel" is used to denote the OS mechanism that sends
EdenFS notifications. On macOS and Linux, that's Fuse; on Windows, that's
ProjectedFS. To avoid platform-specific naming in ObjectFetchContext, let's
rename the fetch cause enum.

Reviewed By: kmancini

Differential Revision: D23462460

fbshipit-source-id: 3ac68cdf4999e6a3b4ff4ee266f94e1f9736df39
2020-09-02 12:15:48 -07:00
Wez Furlong
154d7309c9 eden: introduce SpawnedProcess
Summary:
This commit introduces a new process spawning class derived
from the ChildProcess class in the watchman codebase.

`SpawnedProcess` is similar to folly::Subprocess but is designed around the
idea that we will use a system provided spawning API to start a process, rather
than assuming the use of `fork`.

`fork` is to be avoided because it can be expensive for processes with large
address spaces and also because it interacts poorly with threads on macOS.  In
particular, we see the objC runtime terminating our process in some scenarios
where fork and threads are mixed.

There are some important differences from `folly::Subprocess` and that means
that some assumptions and uses need to be altered slightly from their prior
workings.  For example, detaching a SpawnedProcess moves the responsibility of
waiting on the child to a periodic task as there is no way to detach via
posix_spawn without also using fork.

On the plus side, this commit allows unifying spawning between posix and
windows systems, which simplifies the code!

Reviewed By: xavierd

Differential Revision: D23287763

fbshipit-source-id: b662af1d7eaaa9ed445c42f6c5765ae9af975eea
2020-09-01 13:31:32 -07:00
Stanislau Hlebik
14527beaf4 add ObjectFetchContext with causeDetail field
Summary:
As previous diffs in the stack show, there was at least one place in the
codebase that used an incorrect object context logger, and that resulted in
"blind spots" in undesired file fetch logging, i.e. undesired file fetches were
logged, but neither pid nor cmd-line was logged.

There are quite a few places in the codebase that use the null object fetch
context, and threading the correct object fetch context to all of them might be
hard. Threading the context is a bit annoying, so it would be good to know
something like "EdenDispatcher code is responsible for most of the blind spots,
so let's thread the correct context there first". It would be equally good to
know that none of the null object contexts are responsible for blind spots.

This diff might help us decide where we need to thread real object fetch context
first. Instead of passing null object fetch context let's pass null object
fetch context with causeDetail field. This field will be logged to scuba (see
BackingStoreLogger::logImport code), and instead of getting "Unknown" interface
we'll get e.g. "Unknown - EdenDispatcher::create", and that would highlight
where we need to thread the context.

A note about implementation: getNullContextWithCauseDetail returns a raw pointer
which is expected to be static, i.e. it should work similarly to the current
getNullContext implementation. It's quite a hack, but it allows us to get rid of
memory allocations (we'd have one memory allocation per place in the code where
getNullContextWithCauseDetail is called). Let me know if you are ok with this hack.

Reviewed By: kmancini

Differential Revision: D23422526

fbshipit-source-id: e576bba9fc09e160fc42771c7589cdd1694d93c0
2020-09-01 03:39:18 -07:00
Chad Austin
0683ab6586 fix hg importer test regression
Summary:
Enabling hg dynamicconfigs in D23309090 (d643f48c8c) changed the output of `hg
manifest --debug` and broke HgImportTest. Set TESTTMP to avoid
production configs.

Reviewed By: DurhamG

Differential Revision: D23335847

fbshipit-source-id: 7ffd0394aa7a8466b266000b18f8742ed4a6b53f
2020-08-27 09:44:37 -07:00
generatedunixname89002005287564@sandcastle1323.prn2.facebook.com
2961ea533b Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D23341246

fbshipit-source-id: a084d09f2c21c3dc515bbb2e6eaf150fc05f16a9
2020-08-26 13:47:00 -07:00
Zeyi (Rice) Fan
4d21d2dd8a clean up fs/win/utils/Process.h
Summary: Some clean up to do. `Process` will crash the entire process if `Pipe` is ever `nullptr`, so let's not give it a default argument of `nullptr`.

Reviewed By: xavierd

Differential Revision: D22958765

fbshipit-source-id: 0c35e805f24a0d572bbc08efc97e59a37d0cbf88
2020-08-24 21:38:12 -07:00
Katie Mancini
3827b9787d add fetch type to data fetch logging
Summary:
Having the type of data fetched can help in debugging where these fetches are
coming from. In the current logs, figuring out whether a data fetch is a blob or
a tree requires some manual work. When looking at a large batch of fetches, this
is not practical.

So this includes the fetch type in our logging.

Reviewed By: chadaustin

Differential Revision: D23243444

fbshipit-source-id: 9abe5180c5d2afc0d02b27ba6a6b76401e86556e
2020-08-21 17:38:14 -07:00
Katie Mancini
e1836f679c Add spaces to process name in data fetch logs
Summary:
Previously, pieces of the command line for a process were separated by `\0`.
This makes them a bit hard to read and also makes running queries on them
harder. This converts these `\0` back to spaces to fix this.

See https://fb.workplace.com/groups/edenfs/permalink/1446711485499079/ for
more motivation.

Reviewed By: wez

Differential Revision: D23266909

fbshipit-source-id: e4a9284e04039fcd971bed0d6e21d220e946acdb
2020-08-21 13:57:56 -07:00
Ailin Zhang
58561f9df2 use Eden native import in prefetchBlobs
Summary: Previously we used the HgImporter prefetch request in `prefetchBlobs()`, but using `getBlob()` can give us more control over the prefetch process later. So now `getBlob()` is used in `prefetchBlobs()` when `useEdenNativePrefetch` is configured as true.

Reviewed By: kmancini

Differential Revision: D22984848

fbshipit-source-id: 0bd0b1c5b50bb16da36f188915904d0223827dc3
2020-08-13 21:09:09 -07:00
Xavier Deguillard
e5558221ac store: plumb CMD_CAT_TREE
Summary:
With Mercurial now supporting CMD_CAT_TREE for efficiently fetching and reading
trees, we can plumb this into EdenFS. At startup time, we detect whether
Mercurial supports CMD_CAT_TREE and use that method; otherwise, we fall back to
the old CMD_FETCH_TREE.

Reviewed By: wez

Differential Revision: D23044953

fbshipit-source-id: 9aea5c5b82e97039a75ef18976a155dcb6e150bc
2020-08-12 08:17:25 -07:00
Katie Mancini
58d012d8b4 enable metadata fetching by manifest id
Summary:
Previously we fetched metadata by commit hash and path. We knew this would be a
little more expensive, but it turns out to be a lot more expensive.

Wait, why is it expensive?
In short: lots of extra lookups that are not satisfied by cache :(
In long:
1. Each piece of the path requires a read to fetch the fsnode for that tree.
So asking for the metadata of a/b/c/d/e means 5 reads.
2. Normally these reads could be cached, but often we would make these requests
with a commit hash for a draft commit. On the server side this info is not
cached for a draft commit, which means a lot of database reads and recalculation.
(Most of the real uses of metadata prefetching are when an engineer is working
on a local commit. We just use the commit hash of the commit the user was on
when fetching metadata for a tree, even if that tree hasn't changed since a public
commit, so this means lots of requests with draft commit hashes.)

By fetching by manifest id, we are able to bypass this sequential path lookup.
(Even if we are on a draft commit, if the tree has not changed locally
since a public commit, the manifest id will be the same as in the public commit,
avoiding this whole draft commit issue.)

This allows us to query scs with a manifest id for a tree.

Reviewed By: wez

Differential Revision: D22990687

fbshipit-source-id: aa81d67de1f1d04a14d174774ee216f5ac6be5ba
2020-08-10 23:53:10 -07:00
Xavier Deguillard
53c6c0befd win: rework the Pipe API a bit
Summary:
This makes it similar to the Unix one, which reduces the ifdefs a tiny bit.
Ideally I'd want to move the pipe handling into its own class so callers won't
have to care about Windows/Linux specifics.

Reviewed By: fanzeyi

Differential Revision: D22954056

fbshipit-source-id: c92a25b6abe084a7c7496c0d6e07795779e0abad
2020-08-07 11:05:31 -07:00
Ailin Zhang
106ce3af12 record fetch in HgQueuedBackingStore
Summary:
After `startRecordingFetch()` is called by `HgQueuedBackingStore`, record the path for each fetched file. When `stopRecordingFetch()` is called, stop recording and return the collected file paths.

`startRecordingFetch()` and `stopRecordingFetch()` will be used in the next diff.
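
A minimal sketch of the start/stop recording behavior described above. `FetchRecorder` and `recordFetch` are hypothetical names, and the use of a mutex and a deduplicating set are assumptions of this sketch rather than confirmed details of `HgQueuedBackingStore`.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical recorder; the real HgQueuedBackingStore members may differ.
class FetchRecorder {
 public:
  void startRecordingFetch() {
    std::lock_guard<std::mutex> guard(mutex_);
    recording_ = true;
  }

  // Called for every fetched file; a no-op unless recording is active.
  void recordFetch(const std::string& path) {
    std::lock_guard<std::mutex> guard(mutex_);
    if (recording_) {
      paths_.insert(path);
    }
  }

  // Stops recording and hands back the collected paths, leaving the set empty.
  std::unordered_set<std::string> stopRecordingFetch() {
    std::lock_guard<std::mutex> guard(mutex_);
    recording_ = false;
    return std::exchange(paths_, {});
  }

 private:
  std::mutex mutex_;
  bool recording_{false};
  std::unordered_set<std::string> paths_;
};

// Usage with made-up paths.
inline void fetchRecorderExample() {
  FetchRecorder recorder;
  recorder.recordFetch("not/recorded"); // ignored: recording not started
  recorder.startRecordingFetch();
  recorder.recordFetch("a/b.txt");
  recorder.recordFetch("a/b.txt"); // duplicate collapses in the set
  const auto paths = recorder.stopRecordingFetch();
  assert(paths.size() == 1);
}
```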

Reviewed By: chadaustin

Differential Revision: D22797037

fbshipit-source-id: a1fe30424d3c2884ffe139a0062b1e36328fd4fe
2020-08-04 06:50:45 -07:00
Ailin Zhang
40422c12be log fetch-heavy processes to Scuba at each 2000 more fetches
Summary:
Previously we logged a process to Scuba when it did 2000 (fetchThreshold_) fetches, but then in Scuba all processes had fetch_count = 2000. In order to see approximately how many fetches a process really did, we now log the same process to Scuba every time it does 2000 more fetches.

Note: this change could make the total count of fetch-heavy events in Scuba inaccurate, as we log the same process more than once. So when users want to see how many fetch-heavy events happened, instead of setting "type = fetch_heavy", they should set exactly "fetch_count = 2000".
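
The logging rule above can be sketched as a per-process counter that signals an event each time the count reaches another multiple of the threshold. `FetchCounter` is a hypothetical class, and using a plain `int` for the process id is an assumption made to keep the sketch portable.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical per-process fetch counter.
class FetchCounter {
 public:
  explicit FetchCounter(std::uint64_t threshold) : threshold_(threshold) {
    assert(threshold > 0);
  }

  // Records one fetch for `pid`. Returns true when the new count is a
  // multiple of the threshold, i.e. when a fetch-heavy event would be logged.
  bool recordFetch(int pid) {
    const std::uint64_t count = ++counts_[pid];
    return count % threshold_ == 0;
  }

  // Current fetch count for `pid` (0 if the process was never seen).
  std::uint64_t count(int pid) const {
    const auto it = counts_.find(pid);
    return it == counts_.end() ? 0 : it->second;
  }

 private:
  std::uint64_t threshold_;
  std::unordered_map<int, std::uint64_t> counts_;
};
```

With a threshold of 2000, a process that performs 4500 fetches triggers two events (at 2000 and 4000), which is why filtering on `fetch_count = 2000` still counts each fetch-heavy process once.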

Reviewed By: chadaustin

Differential Revision: D22867679

fbshipit-source-id: ae3c768a8d3b03628db6a77263e715303a814e3d
2020-08-03 11:13:20 -07:00
Ailin Zhang
7f2329a3ff add space between command name and args when logging fetch heavy processes to Scuba
Summary:
Previously, a fetch-heavy event's cmdline was delimited by '\x00' when logged to Scuba (for example: `grep--color=auto-rtest.`).
Now we replace \x00 with a space, so the command name and args are separated by spaces (`grep --color=auto -r test .`).
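
A minimal sketch of the replacement, assuming the command line is held in a `std::string` with embedded NUL bytes. `spaceSeparatedCmdline` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: returns a copy of `cmdline` with every NUL byte
// replaced by a space.
inline std::string spaceSeparatedCmdline(std::string cmdline) {
  std::replace(cmdline.begin(), cmdline.end(), '\0', ' ');
  return cmdline;
}

// Usage with the example command from the summary.
inline void cmdlineExample() {
  using namespace std::string_literals;
  // The `s` suffix keeps the embedded NUL bytes in the std::string.
  const std::string raw = "grep\0--color=auto\0-r\0test\0."s;
  assert(spaceSeparatedCmdline(raw) == "grep --color=auto -r test .");
}
```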

Reviewed By: kmancini

Differential Revision: D22772868

fbshipit-source-id: 4ab42e78c7bc786767eee3413b9586739a12e8ac
2020-07-31 11:42:51 -07:00
Katie Mancini
b71d531a39 add data fetch logging for prefetch
Summary:
This adds logging for files fetched in prefetch, as was already added for
blob and tree fetches.

This is needed to log the fetches caused by the glob files thrift call. The
purpose of this is to help debug the cause of unexpected data fetches (see
D22448048 for more motivation).

Reviewed By: genevievehelsel

Differential Revision: D22561619

fbshipit-source-id: 5ae78b99fb0c7d863d8223b93492b0d0210ddf9e
2020-07-26 23:09:40 -07:00
Katie Mancini
e97f1c7240 logging for thrift object fetch: checkout
Summary:
This adds logging for data fetches that come from the thrift checkout call to
help debug the cause of unexpected data fetches. (See D22448048 for more
motivation)

Reviewed By: chadaustin

Differential Revision: D22489504

fbshipit-source-id: 3b732a1e5627c2130f561ec0138a1df270e1925d
2020-07-26 23:09:40 -07:00
Katie Mancini
9fa13b9393 Create ObjectFetchContext for Thrift
Summary:
We have seen that some of the unexpected data fetches do not originate from
FUSE. This adds parity to the logging for data fetches that come from the thrift
interface. Adding this logging improves the overall observability of eden, and
will help us debug the cause of unexpected data fetching.

This introduces plumbing to allow logging data fetches that originate from
thrift requests.

Reviewed By: chadaustin

Differential Revision: D22448048

fbshipit-source-id: a39dde72467c4922c07c569c14fb499341d40258
2020-07-26 23:09:40 -07:00
Ailin Zhang
03a2308028 get priority from ObjectFetchContext in BackingStore
Summary: Previously, `BackingStore` and all its sub-classes' `getBlob` and `getTree` methods accepted both `ObjectFetchContext` and `ImportPriority` as arguments. Now, `ImportPriority` is removed because we can get the priority from `ObjectFetchContext`.

Reviewed By: kmancini

Differential Revision: D22650629

fbshipit-source-id: e1b0c57a059f11504b28b2c17d698bb58f51e1ee
2020-07-24 08:24:02 -07:00
Ailin Zhang
ce28ec8caa deprioritize when fetch count exceeds threshold
Summary:
Check fetch count before `getPriority()` is used. If fetch count has exceeded `fetchThreshold_`, lower the priority by 1.

Note: this diff only guarantees that the `getPriority()` function returns the lowered priority. How the returned value is used for scheduling is handled by `HgQueuedBackingStore`.
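
A minimal sketch of the rule above, assuming priority can be modeled as a plain integer where a smaller value means lower priority. `FetchContextSketch` and its fields are hypothetical; the actual `ImportPriority` type is richer than an `int`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a fetch context; EdenFS's ImportPriority is a
// richer type than the plain int used here.
struct FetchContextSketch {
  int basePriority;
  std::uint64_t fetchCount;
  std::uint64_t fetchThreshold;

  // Returns the priority lowered by 1 once the fetch count exceeds the
  // threshold; otherwise returns the base priority unchanged.
  int getPriority() const {
    return fetchCount > fetchThreshold ? basePriority - 1 : basePriority;
  }
};

// Usage with made-up numbers.
inline void priorityExample() {
  const FetchContextSketch atThreshold{10, 2000, 2000};
  const FetchContextSketch overThreshold{10, 2001, 2000};
  assert(atThreshold.getPriority() == 10);
  assert(overThreshold.getPriority() == 9);
}
```

As in the summary, the sketch only computes the lowered value; how a scheduler uses it is out of scope here.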

Reviewed By: kmancini

Differential Revision: D22550640

fbshipit-source-id: c032f8f72ca658618ac118dfb3ad3dcae61e9735
2020-07-24 08:24:02 -07:00
Ailin Zhang
102d8586cc make ObjectFetchContext own a copy of ImportPriority
Summary: Previously `getPriority()` always returned a fixed priority. Now that we want `ObjectFetchContext` to lower its priority, it is better to make it have its own copy of `ImportPriority`.

Reviewed By: kmancini

Differential Revision: D22550587

fbshipit-source-id: 029c797def477ae4533f66cfba146a3972cfb65d
2020-07-24 08:24:02 -07:00
Ailin Zhang
faa691ec33 change default value of ImportPriority
Summary: This diff ensures that the value of ImportPriority is always positive and changes offset from 0 to half of the maximum offset to allow lowering offset in the future.

Reviewed By: kmancini

Differential Revision: D22550462

fbshipit-source-id: 69f45369359c7b2c39a0c6831e9b33982e26a16a
2020-07-24 08:24:02 -07:00
Ailin Zhang
20ae54a69b clear fetch counts using eden debug gc_process_fetch
Summary: Add a thrift call to clear `pidFetchCount_` in `ObjectStore` and call it in `eden debug gc_process_fetch`. Users might want this command to start a new recording of process fetch counts.

Reviewed By: kmancini

Differential Revision: D22583430

fbshipit-source-id: eba7d63b08da5134fd09b7512895aba06f6a7ca5
2020-07-23 11:54:12 -07:00
Katie Mancini
c6900de010 quiet noisy scs proxy hash logging
Summary:
Lower the debug level for scs proxy hash logging so that this does not cause so
much noise in the logs of production eden.

Reviewed By: chadaustin

Differential Revision: D22668574

fbshipit-source-id: 1a7c2a4706514c0ef0bb848424681eef9316d296
2020-07-22 13:10:07 -07:00
Victor Zverovich
e3f4a56f6b Migrate to field_ref Thrift API
Summary:
We are unifying C++ APIs for accessing optional and unqualified fields:
https://fb.workplace.com/groups/1730279463893632/permalink/2541675446087359/.

This diff migrates code from accessing data members generated from unqualified
Thrift fields directly to the `field_ref` API, i.e. replacing

```
thrift_obj.field
```

with

```
*thrift_obj.field_ref()
```

The `_ref` suffixes will be removed in the future once data members are private
and names can be reclaimed.

The output of this codemod has been reviewed in D20039637.

The new API is documented in
https://our.intern.facebook.com/intern/wiki/Thrift/FieldAccess/.

drop-conflicts

Reviewed By: yfeldblum

Differential Revision: D22631599

fbshipit-source-id: 9bfcaeb636f34a32fd871c7cd6a2db4a7ace30bf
2020-07-21 11:23:35 -07:00
Lee Howes
63faf1efd6 Replace Future::getTry with Future::result
Summary:
Replace calls to Future<T>::getTry() with Future<T>::result().

This change is behaviour-neutral. It enables us to make the behavior of getTry
match expectations and be blocking, like get, and r-value qualified, like
SemiFuture<T>::getTry().

Reviewed By: chadaustin

Differential Revision: D22510034

fbshipit-source-id: bd45cc6a404293089371654829a63c81b1c706aa
2020-07-13 14:14:30 -07:00
Katie Mancini
614729cb49 Fetch aux data for blobs from scs
Summary:
Buck uses the content SHA-1 to identify each of the source files for a target.
During the parsing phase it needs these SHAs, though the content of the
files is not yet needed, and may never be needed if the file has already
been built and is in the buck cache.

Currently, if we do not already have metadata cached for a file when
requested we fetch the contents of the file, and compute the hash.

We want to avoid this.

Eventually this data will be available from the Mononoke EdenAPI server,
but for now we want a temporary solution to unblock the Buck team, and
ship benefits early.

Reviewed By: chadaustin

Differential Revision: D21820913

fbshipit-source-id: 56a7e32519f0fb04881518306d94aaed33527fd9
2020-07-10 16:03:32 -07:00
Katie Mancini
4962a7face tree metadata storage
Summary:
Prefetching metadata for the entries in a tree when we fetch it saves us
an extra round trip to the server to fetch a blob when only the metadata
for that blob is needed. (This can happen often while parsing targets in
builds.)

We will prefetch the metadata for each of the entries in a tree when
we fetch the tree, and store the metadata for each entry under that
entry's id (to make looking up the entry metadata by its id quick).

However, we also don't want to unnecessarily fetch data from
the server if we have already done so.

To accomplish this we will also store the metadata for each entry under the tree
id in the local store. This will: 1) allow us to check whether we have already
fetched the metadata from the server when we are fetching a tree (we only have the
tree id easily available here, so storing the metadata under the tree id makes
this check much easier and less expensive); 2) allow us to refill the
metadata for each entry stored under that entry's blob id if it has been cleared
from the local store (this may happen if the local store gets large and is
partially cleaned to reclaim space).

This implements the method to store tree metadata for all entries under the tree
id and under the blob id for each entry.

Reviewed By: chadaustin

Differential Revision: D22239173

fbshipit-source-id: d4e0ffd642ce0b4034188cfc4eeaf2ea05f54e77
2020-07-10 16:03:32 -07:00
Katie Mancini
af70a36a41 introduce metadata importer
Summary:
This will allow adding custom MetadataImporters in different eden builds.

DefaultMetadataImporter provides a no-op version of the interface to be
used by default.

Reviewed By: chadaustin

Differential Revision: D21960834

fbshipit-source-id: aec8a3627ab1223f74466b92a0ebe3290b67b7ed
2020-07-10 16:03:32 -07:00
Katie Mancini
26a7f3ad25 Wrap backing store's copy of local store in shared ptr
Summary:
Previously the BackingStore kept a raw pointer to the LocalStore. To do this we relied on EdenServer ensuring the lifetime of the LocalStore exceeds that of the BackingStore.

This makes the LocalStore pointer a shared pointer to explicitly make sure that the LocalStore's lifetime matches the BackingStore's lifetime.

Reviewed By: chadaustin

Differential Revision: D22394597

fbshipit-source-id: c81cb26c6fc8f834bc46d8576ced06ba6a96ac2c
2020-07-10 16:03:32 -07:00
Katie Mancini
550400364d introduce tree metadata storage in local store
Summary:
This introduces a class to manipulate the metadata for all the entries in a
tree. This adds serialization and deserialization to this class so that it can
be written to the local store.

Why do we need this? We need some way to easily check whether we have already
fetched metadata for a tree and do not need to refetch it from the server, to
avoid expensive network requests. Later diffs add functionality to store the metadata
for tree entries in the local store under the tree hash using this class.

Reviewed By: chadaustin

Differential Revision: D21959015

fbshipit-source-id: 0c0e8750737f3076c1f9604d0319cab7f2658656
2020-07-10 16:03:32 -07:00
Katie Mancini
264d749d67 record ScsProxyHash in LocalStore during import
Summary:
In following diffs we will use scs to prefetch metadata for files, so that this data
will be available without fetching the file content (which will improve build times
on eden).

This builds up the proxy hash index that serves as a conversion between eden-specific
identifiers and the commit and path which we will use to index into scs.

Reviewed By: chadaustin

Differential Revision: D21820909

fbshipit-source-id: 17891f6772f49c7c183061d7a4df2fe0a3be9d25
2020-07-10 16:03:32 -07:00
Katie Mancini
dc94bc8916 create scs proxy hash
Summary:
In following diffs we will use scs to prefetch metadata for files, so that this data
will be available without fetching the file content (which will improve build times
on eden).

SCS indexes trees by an scs-specific hash (blake2 content hash) or by the commit
hash and path. Since this is different from the eden hashes and mercurial
hashes, we need another index to go between the current ids we have in eden
and identifiers for scs.

This introduces a proxy hash that serves as this conversion. Because we have
commit hashes around in eden right now, this is currently the easier route to
indexing into scs.

Reviewed By: chadaustin

Differential Revision: D21237648

fbshipit-source-id: 79115ac034a5f062ae879713cd2c1a17f348c725
2020-07-10 16:03:31 -07:00
Chad Austin
1031c6a211 stop shipping hg_import_helper.py
Summary:
proxy_import_helper.py exists for compatibility with older EdenFS
builds. None of those builds are running anymore, so remove it.

Reviewed By: genevievehelsel

Differential Revision: D22451196

fbshipit-source-id: 4d258b3fafe13bb67bd11259f5d1193a7e5575e6
2020-07-09 11:28:07 -07:00
Katie Mancini
df2b9b9009 open all RocksDB column families for backwards compatibility
Summary:
From the RocksDB documentation:

> When opening a DB in a read-write mode, you need to specify all Column
Families that currently exist in a DB. If that's not the case, DB::Open call
will return Status::InvalidArgument()

This can cause problems for us in a couple of situations:
- When we need to rollback from an eden version where we added a column to
our configuration for RocksDB
- When we delete a column from our configuration for RocksDB

To make sure we do not encounter this error we need to make sure that we still
open all the column families existing in the database, even if they are not in
our configured list of column families.
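
A minimal sketch of the list merging this implies: take the configured column family names and append any names that exist in the database but are not configured. `columnFamiliesToOpen` and the sample names are hypothetical. In practice the existing names could be obtained from RocksDB's `DB::ListColumnFamilies`, which is not called here so that the sketch stays free of the RocksDB dependency.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the configured families first (order
// preserved), followed by families that exist on disk but are not configured.
inline std::vector<std::string> columnFamiliesToOpen(
    const std::vector<std::string>& configured,
    const std::vector<std::string>& existingOnDisk) {
  std::vector<std::string> result = configured;
  for (const auto& name : existingOnDisk) {
    if (std::find(result.begin(), result.end(), name) == result.end()) {
      result.push_back(name);
    }
  }
  return result;
}

// Usage with made-up column family names: "removed_cf" was dropped from the
// configuration but still exists in the database, so it is still opened.
inline void columnFamiliesExample() {
  const std::vector<std::string> configured{"default", "blob"};
  const std::vector<std::string> onDisk{"default", "blob", "removed_cf"};
  const auto toOpen = columnFamiliesToOpen(configured, onDisk);
  assert(toOpen.size() == 3);
  assert(toOpen.back() == "removed_cf");
}
```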

Reviewed By: wez

Differential Revision: D22425310

fbshipit-source-id: 9822b22cfedf4633f65bbed96f95a546dd3614f4
2020-07-09 10:28:14 -07:00
Zeyi (Rice) Fan
21c11f4921 make FuseRequest high priority & TreePrefetchLease low priority
Summary:
This is much better than having `ObjectFetchContext` itself own a copy of `ImportPriority`. We can actually customize how different fetch contexts manage these priorities.

We set all FUSE requests to a higher priority and prefetch requests to a lower priority.

Reviewed By: xavierd

Differential Revision: D22342802

fbshipit-source-id: b9c1d0f2ddbc7a5e5d619bc2c2222e5df0e702af
2020-07-02 12:00:45 -07:00
Zeyi (Rice) Fan
07452335fb use ObjectFetchContext for priority
Summary: This commit switches from explicitly specifying `ImportPriority` to passing priorities from `ObjectFetchContext`.

Reviewed By: xavierd

Differential Revision: D21872720

fbshipit-source-id: 26055eff21cab4ce6370e96ac3acbac2fd6af3f0
2020-07-02 12:00:45 -07:00
Zeyi (Rice) Fan
5030f0be35 add ImportPriority to ObjectFetchContext
Summary:
This commit adds `ImportPriority` to `ObjectFetchContext`. By doing so we can tweak the priority for a request at different stages.

This commit also provides a default implementation for the virtual methods in `ObjectFetchContext` so we can create one to carry `ImportPriority` in some specific cases.

Reviewed By: chadaustin

Differential Revision: D21872718

fbshipit-source-id: 6e8cfd84959b368e6fe69fda2baf0debf7a88295
2020-07-02 12:00:45 -07:00
Ailin Zhang
7c603e51f8 make fetch heavy threshold configurable
Summary: This diff made fetch threshold configurable, so we can change it later as repository size grows.

Reviewed By: fanzeyi

Differential Revision: D22337850

fbshipit-source-id: 4b46420cb4e7164a3f1080279d67fa5f90549cd8
2020-07-02 08:44:02 -07:00
Ailin Zhang
36fd61dbaa send fetch heavy events to Scuba
Summary: This diff updated `ObjectStore` to send a `FetchHeavy` event to Scuba when the number of fetching requests of a process has reached 2000.

Reviewed By: fanzeyi

Differential Revision: D22292104

fbshipit-source-id: b7ac48412868216ea960c8681a5fb71c587952bc
2020-07-02 07:57:15 -07:00
Xavier Deguillard
bd26254f79 eden: fix windows build
Summary:
Both optional and pid_t weren't found and the right includes needed to be
provided. On Windows, the ProcessNameCache isn't compiled (yet), and since it
looks like the process name is optional in the BackingStoreLogger, let's not
provide it for now.

Reviewed By: fanzeyi

Differential Revision: D22215581

fbshipit-source-id: 31a7e7be62cd3d14108dc437d3dfabfb9e62f8d5
2020-06-24 15:12:47 -07:00
Katie Mancini
1876c4e77b adding logging for selective paths
Summary:
Eden can sometimes unexpectedly fetch files from the server, and we want
to know why this is happening. This adds logging for the source of
data fetching in eden's backing store to help clarify why these fetches
are happening.

This temporarily adds the logging in the HgQueuedBackingStore to get a naive
version of logging rolled out sooner. Follow-up changes will move this logging
closer to the data fetching itself if possible (in HgDatapackStore and HgImporter).

Reviewed By: chadaustin

Differential Revision: D22012572

fbshipit-source-id: b1b012ce4ee133fbacecd586b7365c3c5a5386df
2020-06-23 10:02:41 -07:00
Katie Mancini
8d32611a23 add data fetch logger
Summary:
We have seen that eden will unexpectedly fetch data, and we want to know why.

This adds the plumbing to interact with eden's current logging to be able to
log when eden fetches data from the server and what caused eden to do this
fetch. Later changes will use the classes created here to log the cause of data
fetches.

Reviewed By: chadaustin

Differential Revision: D22051013

fbshipit-source-id: 27d377d7057e66f3e7a304cd7004f8aa44f8ba62
2020-06-23 10:02:41 -07:00
Katie Mancini
fdb1af8bc9 add cause info to objectFetchContext
Summary:
Recently the server team added an unused directory to test that eden would not
fetch its files, as a test for the upcoming repo merge. They saw that these files
were fetched a non-trivial number of times. We want to know why eden is causing
these fetches.

This adds the pid and interface through which the client is interacting with eden to
ObjectFetchContext for this purpose. This information will be logged in later
changes.

Note: currently this is only for fetches through Fuse (thrift interface to follow).

Reviewed By: chadaustin

Differential Revision: D22050919

fbshipit-source-id: 49b93257a0e6d910f48b1e8ec6471527e056d22a
2020-06-23 10:02:40 -07:00
Katie Mancini
a0b05b4bf0 thread ObjectFetchContext to backing store
Summary:
This passes ObjectFetchContext into the backing store to prepare for adding
logging for the cause of server fetches.

In following changes I will add logging in the HgQueuedBackingStore.
Ultimately we will want to move this logging to be closer to the data fetching
(in HgDatapackStore and HgImporter), but I plan to temporarily add logging to
the HgQueuedBackingStore to keep things simple so that we can roll out more quickly.

Reviewed By: chadaustin

Differential Revision: D22022992

fbshipit-source-id: ccb428458cbf7a1e33aaf9be9d0d766c45acedb3
2020-06-23 10:02:40 -07:00
Katie Mancini
480277e328 refactor - move ObjectFetchContext to its own file
Summary:
In following changes I will be threading ObjectFetchContext into the backing
store importing process. Since this will start to be used more outside of the
ObjectStore, I am moving this class into its own files.

Reviewed By: chadaustin

Differential Revision: D22022488

fbshipit-source-id: 1a291fea6e0fd56855936962363dfc9f6de8533d
2020-06-23 10:02:40 -07:00
Ailin Zhang
cec1cf648c make ObjectStore manage a PID-fetchCounts map
Summary: This diff adds a PID-fetchCounts map to `ObjectStore` and makes `ObjectStore` update that map after every `didFetch`

Reviewed By: kmancini

Differential Revision: D22100413

fbshipit-source-id: 740342c7b4a453fe482344c2db9542381c3772e4
2020-06-19 21:07:49 -07:00
Xavier Deguillard
baa6894151 store: reap importer when the importer thread dies
Summary:
On Windows (haven't verified on other platforms), ThreadLocalPtr doesn't appear
to release resources when a thread dies. This means that when the importer
thread dies, the actual importer (hg.real) would still run and use resources,
with no way of talking to it.

To fix this, let's manually reset it when the main thread function returns,
this forces the importer to be destroyed and therefore the various handles to
hg.real to be released, effectively terminating it.

I'm not sure if this is the proper fix, but delving into folly feels a bit
daunting. Keeping a TODO for later to go back to it and fix it properly in
folly.

Reviewed By: chadaustin

Differential Revision: D22012540

fbshipit-source-id: 99f994bb5128b38ccf8474031763b8a21055759a
2020-06-18 20:41:53 -07:00
Zeyi (Rice) Fan
ae0d08f884 record backing store import in RequestData
Summary:
Previously we checked whether a request is a fuse request whenever we fetched anything from the backing store, so we could collect the number of fetches that happened for each process in eden top.

This creates a dependency from store to fuse, which is a little awkward. Instead, we could make `RequestData` an `ObjectFetchContext` and record the fetches when that happens.

Similarly in the future we should also have something equivalent in our Thrift layer.

Reviewed By: kmancini

Differential Revision: D21775919

fbshipit-source-id: 95056830ddbe7c999051c43e0d8eca9a67350904
2020-06-18 10:40:40 -07:00