Commit Graph

24 Commits

Author SHA1 Message Date
Xavier Deguillard
ee80b1a67f integration: use count metric instead of rate
Summary:
The rate metric can be unreliable, and in some environment is never updated by
the time it is read by the test. Since this test cares about the number of
events, using count is a better metric, and more reliable.

Differential Revision: D30377128

fbshipit-source-id: f10c656567e3a3c07b66ecc6fc563a53199e3088
2021-08-17 17:29:33 -07:00
Zhengchao Liu
3eea6fbe7e allow eden stats output subset of stats and in JSON
Summary:
This adds the options to `eden stats` for collecting only fast stats and printing in JSON.

`eden stats` can be slow especially due to collecting fb303 counters and private bytes. An example use case of this new lightweight endpoint is that Buck can poll it to display Eden related info in its cli (see [post](https://fb.workplace.com/groups/132499338763090/permalink/210396380973385/) for context).

Reviewed By: xavierd

Differential Revision: D29687041

fbshipit-source-id: a663e71231527c5dfb822acbf238af0ac6ce4a00
2021-08-03 15:29:45 -07:00
Katie Mancini
9dee7d6c40 nfs: run custom integration tests on nfs
Summary:
There are a few tests that we don't currently run on both git and hg,
this means we need a new decorator to run the tests on nfs.

Reviewed By: xavierd

Differential Revision: D27953136

fbshipit-source-id: f6e1fab1f763c9b878315336b2cdaa529d36ffe5
2021-05-12 15:01:08 -07:00
Saurabh Singh
cfe084d02f telemetry: switch to using quantile stats instead of timeseries
Summary:
Timeseries is memory intensive and not really required in the current context
it is being used.

Reviewed By: chadaustin

Differential Revision: D26315632

fbshipit-source-id: ee51c3ad8bef6fce152aa787c8c4602f0b499f92
2021-02-14 16:37:08 -08:00
Chad Austin
3b04c02b29 fix flaky integration tests
Summary:
Under heavy parallelism or system load, our tests could trigger
short-ish timeouts and cause tests to flake. The stats test in
particular often failed in continuous integration. It looks like
opening a unix domain Thrift socket early and holding onto it can
cause it to sometimes hit ThriftServer's default idle timeout of 60
seconds, which results in the test failing with BrokenPipeError
(EPIPE).

Reviewed By: simpkins

Differential Revision: D21780023

fbshipit-source-id: 7e8838429475c2a322d836b9a497411199948cce
2020-05-29 11:49:37 -07:00
Katie Mancini
48804631f7 expose number pending fuse requests
Summary:
This sets up the counters that will allow us to expose the number
of pending FUSE requests in Eden top.

As D20846826 mentions adding metrics for FUSE request gives
visibility into fuse requests and overall health of eden.

This provides more insight beyond the metrics for live FUSE requests
since it shows the kernels view of FUSE requests. Looking at the difference
between the number of pending and live request, can identify issues
that arise at the interface between eden and FUSE and monitor how
quickly fuse workers are processing requests.

**note**: this is only for linux since macos has no equivalent to
`/sys/fs/fuse/connections`

Reviewed By: chadaustin

Differential Revision: D21074489

fbshipit-source-id: c0951f0dfd4fa764be28d8686d08cd0dd807db37
2020-04-28 13:28:01 -07:00
Katie Mancini
d850667e19 expose live FUSE requests
Summary:
This exposes metrics for the live FUSE requests (the duration
of the longest outstanding request and the number of outstanding
requests).

Because FUSE is the interface through which the user mostly interacts
with the file system they provide good metrics to judge if the perfomance
of eden is normal, or there may be an issue.

Exposing these counters this way will send them to ods, so it will not only
allow for debuging current issues, but can be used to look back at previous
problems. This data could also be used for alerting or more proactive
remediation.

Metrics are exposed per checkout to allow seeing which checkout was
having issues. This data will aggregated in `eden top` to be used as
an overall health indicator, but should more information be needed it
will be logged in ods.

Reviewed By: chadaustin

Differential Revision: D20922194

fbshipit-source-id: 16208883417acb77b62bf712cfdd9068c5420303
2020-04-22 12:33:34 -07:00
Adam Simpkins
ef8c3435c4 remove the "edenfsctl repository" subcommand
Summary:
We no longer use repository configs, so remove the `repository` subcommand
that supported adding and listing these configurations.

The main information that used to be included in the repository configuration
was the bind mount settings.  This has since been replaced with the
`.eden-redirections` file that is placed directly in each repository.

Reviewed By: wez

Differential Revision: D20876462

fbshipit-source-id: cc7d8e6f0a6a2e04fbf3159417af41a44908b3a8
2020-04-10 13:57:52 -07:00
Katie Mancini
e299b71988 Explicitly track current/pending imports -- refactor and expose current metrics
Summary:
As mentioned in D20629833, adding metrics for live
imports in `eden top` gives more transparency to the
imports process and makes identifying import related
issues easier. This is set up to expose metrics for live
imports like those for pending imports in `eden top`.

Similar to D20611728 exposing this via these counters
will log this data. Having this data persisted will allow
tracking the performance of imports, and does the set
up for more pro-active fixing of issues. Further we can
look back to see issues that are no longer occurring, but
still of interest.

This also refactors the registration code so that it requires
no copy pasting to add a new counter. Avoiding copy paste
errors when adding more counters and making it easier to
maintain.

Reviewed By: chadaustin

Differential Revision: D20630813

fbshipit-source-id: 8a7a2a0135c7b7a5cde960b84dcb434c6c99eaeb
2020-04-09 12:35:22 -07:00
Katie Mancini
944a471065 expose the longest outstanding import
Summary:
The primary purpose of exposing counters for the maximum length of duration is to add this to `eden top`.

As discussed in D20611704, the duration of time that imports are queued for is a strong indicator of the health of the import process. Adding it to `eden top` will help give users insight into what eden is doing when it hangs for a long time or when there is an issue, indicate whether that issue is due to the import process.

Additionally we choose to use these counters to expose this data because we want to log this data. This data can be used to measure performance of the queueing and import process. In the future this data could be used to set up alerting for regressions and allow more proactive fixes.

Reviewed By: chadaustin

Differential Revision: D20611728

fbshipit-source-id: 9307c1ad749ac5fe356ba9eaf868de41b1a8a3a7
2020-04-01 09:54:30 -07:00
Katie Mancini
b0a5ffb100 Expose number of imports queued
Summary: expose the counters for number of pending imports (blobs, trees, prefetches) to allow use in tooling

Reviewed By: chadaustin

Differential Revision: D20269853

fbshipit-source-id: d2b7e2110520290751699c4a891d41ebd5b374cf
2020-03-06 09:28:55 -08:00
Katie Mancini
52e211fe8e Record Mercurial file import time
Summary:
- added logging only around the import blob call to capture non-queue related wait time
- added to `test_reading_file_gets_file_from_hg` in `integration.stats_test.HgBackingStoreStatsTest`  to test import blob logging in addition to the get blob loging

(not yet done for importing trees, will do in next diff)

Reviewed By: chadaustin

Differential Revision: D20201215

fbshipit-source-id: c89281fe7d3d6e89d111ac8cce9014adff44ac40
2020-03-03 11:44:27 -08:00
Chad Austin
3d3c2a9f32 stop storing sizes in their own local store column
Summary:
Remove some half-baked, unnecessary logic for caching sizes separately
from SHA-1. Eden's backing stores do not support chunking large files
yet, so there's no value in caching content SHA-1 and size
separately. This fixes a scenario where fetching blob size and then
SHA-1 would result in two backing store imports.

Reviewed By: fanzeyi

Differential Revision: D19169096

fbshipit-source-id: dc32f3313e5f4230c06a5bbaa67da7bf0febaba8
2019-12-20 16:14:23 -08:00
Andres Suarez
fbdb46f5cb Tidy up license headers
Reviewed By: chadaustin

Differential Revision: D17872966

fbshipit-source-id: cd60a364a2146f0dadbeca693b1d4a5d7c97ff63
2019-10-11 05:28:23 -07:00
Jake Crouch
a253e8045b Fix flaky counter mount/unmount test
Summary: The CountersTest would previously fail if by chance the counters prefixed by "thrift" and "thrift_client" were accounted for between getting "counters" and "counters2", since these counters should not be modified when mounting/unmounting mounts we will just filter them out.

Reviewed By: chadaustin

Differential Revision: D16265511

fbshipit-source-id: 21af0dff345977692785136ca0333d23d5c77e0d
2019-07-17 12:16:18 -07:00
Jake Crouch
277da58973 Unregister journal stats callbacks
Summary: I found out that the journal stats callbacks that were getting registered were not getting unregistered, this diff fixes that.

Reviewed By: chadaustin

Differential Revision: D16187569

fbshipit-source-id: 8c84e1515e376ccd7036a22c06e2e6b98dc62342
2019-07-12 12:27:28 -07:00
Brian Strauch
3986ddb614 ObjectStore stats
Summary: Added the cli command `eden stats object-store` for querying the counts on what part of the object store was responsible for finding the blob or blob size (local store or backing store). This will tell us how well local and in-memory caching works for different workflows.

Reviewed By: chadaustin

Differential Revision: D15934535

fbshipit-source-id: 70345f11a51c3c6996dc001d4101744395a3d182
2019-07-01 12:49:57 -07:00
Zeyi (Rice) Fan
a38b05612d concurrently importing blobs from Mercurial and Mononoke
Summary: This diff takes care of importing blob from Mononoke and Mercurial at the same time, also improves the name situation in the statistics counters.

Reviewed By: strager

Differential Revision: D15768557

fbshipit-source-id: 10cf831b1ae6dc9e6b91f1e96508c4fa92583743
2019-06-24 13:45:02 -07:00
Adam Simpkins
9bfb48c921 update license headers in .py files
Summary:
Update the copyright & license headers in Python files to reflect the
relicensing to GPLv2

Reviewed By: wez

Differential Revision: D15487088

fbshipit-source-id: 9f2138dff41048d2c35f15e09a04ae5a9c9c80dd
2019-06-19 17:02:46 -07:00
Jake Crouch
0dc6812f33 Print out Journal Info with edenfsctl
Summary: Print out memory usage and entry counts with edenfsctl stats

Reviewed By: chadaustin

Differential Revision: D15607015

fbshipit-source-id: 866960ea1d3b5e9fdbe24df3b57fba419795ec76
2019-06-07 13:37:02 -07:00
Matt Glazar
1ec3027f1e Fix flakiness in FUSE stats tests
Summary:
test_reading_committed_file_bumps_read_counter is flaky with optimized builds of edenfs. I think it's flaky because FuseChannel bumps counters *after* responding to the kernel, so the test can call get_counters before the counters are bumped.

Fix the flakiness by making the test wait a while for the counters to change.

Reviewed By: chadaustin

Differential Revision: D15550972

fbshipit-source-id: 891e5d0a9748b43eb0ef1089ef6bc0a547c47d4d
2019-05-30 18:13:24 -07:00
Matt Glazar
1775680391 Test FUSE and HgBackingStore stat collection
Summary:
I am refactoring edenfs' EdenStats class. In doing so, I accidentally removed a call to `aggregate`, causing `flushStatsNow` to not expose accurate counters. This wasn't caught by any existing tests, only by manual testing.

Add some tests to prevent FUSE and HgBackingStore statistics collection from completely regressing.

Reviewed By: simpkins

Differential Revision: D15274275

fbshipit-source-id: c8a9c9848dd60aee7f252a93f10ddce6d7560799
2019-05-28 15:43:19 -07:00
Matt Glazar
9e6a39e951 Factor duplicate code into EdenTestCase.get_counters
Summary:
In another diff, Adam Simpkins noticed that HgImporterStatsTest.get_counters was clunky and that the logic belongs in the EdenTestCase base class.

Hoist HgImporterStatsTest.get_counters into EdenTestCase. Also avoid reusing the Thrift client between calls to get_counters because edenfs might restart between calls (in which case the old Thrift client won't work).

Reviewed By: wez

Differential Revision: D15514537

fbshipit-source-id: 0ae25106baa0e5b2d857b0bb2552d884b9b270ef
2019-05-28 15:43:18 -07:00
Matt Glazar
37a993bf42 Track how many HgImporter requests are made
Summary:
Sometimes, EdenFS goes bonkers and talks to 'hg debugedenimporthelper' a lot for seemingly no reason.

Make these situations easier to debug by counting how many requests EdenFS makes to 'hg debugedenimporthelper'. These counters let us answer questions such as the following:

* When performance sucks, is EdenFS is making a lot of requests or only a few requests? I.e. is Hg slow at responding or is EdenFS very demanding?
* Does a recent performance issue correlate with EdenFS communicating with 'hg debugedenimporthelper'?
* Which engineers are outliers having orders of magnitude more 'hg debugedenimporthelper' requests than p50 engineers?

We could get fancier with these counters and include the number of bytes received, the duration of the request, etc. For now, just having a request count is useful.

Reviewed By: simpkins

Differential Revision: D14677339

fbshipit-source-id: 7f8f394fb0096aef65d6a8a45d7da5936db539a0
2019-04-10 19:58:16 -07:00