Summary:
Under heavy parallelism or system load, our tests could trigger
short-ish timeouts and cause tests to flake. The stats test in
particular often failed in continuous integration. It looks like
opening a unix domain Thrift socket early and holding onto it can
cause it to sometimes hit ThriftServer's default idle timeout of 60
seconds, which results in the test failing with BrokenPipeError
(EPIPE).
Reviewed By: simpkins
Differential Revision: D21780023
fbshipit-source-id: 7e8838429475c2a322d836b9a497411199948cce
Summary:
This sets up the counters that will allow us to expose the number
of pending FUSE requests in Eden top.
As D20846826 mentions adding metrics for FUSE request gives
visibility into fuse requests and overall health of eden.
This provides more insight beyond the metrics for live FUSE requests
since it shows the kernels view of FUSE requests. Looking at the difference
between the number of pending and live request, can identify issues
that arise at the interface between eden and FUSE and monitor how
quickly fuse workers are processing requests.
**note**: this is only for linux since macos has no equivalent to
`/sys/fs/fuse/connections`
Reviewed By: chadaustin
Differential Revision: D21074489
fbshipit-source-id: c0951f0dfd4fa764be28d8686d08cd0dd807db37
Summary:
This exposes metrics for the live FUSE requests (the duration
of the longest outstanding request and the number of outstanding
requests).
Because FUSE is the interface through which the user mostly interacts
with the file system they provide good metrics to judge if the perfomance
of eden is normal, or there may be an issue.
Exposing these counters this way will send them to ods, so it will not only
allow for debuging current issues, but can be used to look back at previous
problems. This data could also be used for alerting or more proactive
remediation.
Metrics are exposed per checkout to allow seeing which checkout was
having issues. This data will aggregated in `eden top` to be used as
an overall health indicator, but should more information be needed it
will be logged in ods.
Reviewed By: chadaustin
Differential Revision: D20922194
fbshipit-source-id: 16208883417acb77b62bf712cfdd9068c5420303
Summary:
We no longer use repository configs, so remove the `repository` subcommand
that supported adding and listing these configurations.
The main information that used to be included in the repository configuration
was the bind mount settings. This has since been replaced with the
`.eden-redirections` file that is placed directly in each repository.
Reviewed By: wez
Differential Revision: D20876462
fbshipit-source-id: cc7d8e6f0a6a2e04fbf3159417af41a44908b3a8
Summary:
As mentioned in D20629833, adding metrics for live
imports in `eden top` gives more transparency to the
imports process and makes identifying import related
issues easier. This is set up to expose metrics for live
imports like those for pending imports in `eden top`.
Similar to D20611728 exposing this via these counters
will log this data. Having this data persisted will allow
tracking the performance of imports, and does the set
up for more pro-active fixing of issues. Further we can
look back to see issues that are no longer occurring, but
still of interest.
This also refactors the registration code so that it requires
no copy pasting to add a new counter. Avoiding copy paste
errors when adding more counters and making it easier to
maintain.
Reviewed By: chadaustin
Differential Revision: D20630813
fbshipit-source-id: 8a7a2a0135c7b7a5cde960b84dcb434c6c99eaeb
Summary:
The primary purpose of exposing counters for the maximum length of duration is to add this to `eden top`.
As discussed in D20611704, the duration of time that imports are queued for is a strong indicator of the health of the import process. Adding it to `eden top` will help give users insight into what eden is doing when it hangs for a long time or when there is an issue, indicate whether that issue is due to the import process.
Additionally we choose to use these counters to expose this data because we want to log this data. This data can be used to measure performance of the queueing and import process. In the future this data could be used to set up alerting for regressions and allow more proactive fixes.
Reviewed By: chadaustin
Differential Revision: D20611728
fbshipit-source-id: 9307c1ad749ac5fe356ba9eaf868de41b1a8a3a7
Summary: expose the counters for number of pending imports (blobs, trees, prefetches) to allow use in tooling
Reviewed By: chadaustin
Differential Revision: D20269853
fbshipit-source-id: d2b7e2110520290751699c4a891d41ebd5b374cf
Summary:
- added logging only around the import blob call to capture non-queue related wait time
- added to `test_reading_file_gets_file_from_hg` in `integration.stats_test.HgBackingStoreStatsTest` to test import blob logging in addition to the get blob loging
(not yet done for importing trees, will do in next diff)
Reviewed By: chadaustin
Differential Revision: D20201215
fbshipit-source-id: c89281fe7d3d6e89d111ac8cce9014adff44ac40
Summary:
Remove some half-baked, unnecessary logic for caching sizes separately
from SHA-1. Eden's backing stores do not support chunking large files
yet, so there's no value in caching content SHA-1 and size
separately. This fixes a scenario where fetching blob size and then
SHA-1 would result in two backing store imports.
Reviewed By: fanzeyi
Differential Revision: D19169096
fbshipit-source-id: dc32f3313e5f4230c06a5bbaa67da7bf0febaba8
Summary: The CountersTest would previously fail if by chance the counters prefixed by "thrift" and "thrift_client" were accounted for between getting "counters" and "counters2", since these counters should not be modified when mounting/unmounting mounts we will just filter them out.
Reviewed By: chadaustin
Differential Revision: D16265511
fbshipit-source-id: 21af0dff345977692785136ca0333d23d5c77e0d
Summary: I found out that the journal stats callbacks that were getting registered were not getting unregistered, this diff fixes that.
Reviewed By: chadaustin
Differential Revision: D16187569
fbshipit-source-id: 8c84e1515e376ccd7036a22c06e2e6b98dc62342
Summary: Added the cli command `eden stats object-store` for querying the counts on what part of the object store was responsible for finding the blob or blob size (local store or backing store). This will tell us how well local and in-memory caching works for different workflows.
Reviewed By: chadaustin
Differential Revision: D15934535
fbshipit-source-id: 70345f11a51c3c6996dc001d4101744395a3d182
Summary: This diff takes care of importing blob from Mononoke and Mercurial at the same time, also improves the name situation in the statistics counters.
Reviewed By: strager
Differential Revision: D15768557
fbshipit-source-id: 10cf831b1ae6dc9e6b91f1e96508c4fa92583743
Summary:
Update the copyright & license headers in Python files to reflect the
relicensing to GPLv2
Reviewed By: wez
Differential Revision: D15487088
fbshipit-source-id: 9f2138dff41048d2c35f15e09a04ae5a9c9c80dd
Summary:
test_reading_committed_file_bumps_read_counter is flaky with optimized builds of edenfs. I think it's flaky because FuseChannel bumps counters *after* responding to the kernel, so the test can call get_counters before the counters are bumped.
Fix the flakiness by making the test wait a while for the counters to change.
Reviewed By: chadaustin
Differential Revision: D15550972
fbshipit-source-id: 891e5d0a9748b43eb0ef1089ef6bc0a547c47d4d
Summary:
I am refactoring edenfs' EdenStats class. In doing so, I accidentally removed a call to `aggregate`, causing `flushStatsNow` to not expose accurate counters. This wasn't caught by any existing tests, only by manual testing.
Add some tests to prevent FUSE and HgBackingStore statistics collection from completely regressing.
Reviewed By: simpkins
Differential Revision: D15274275
fbshipit-source-id: c8a9c9848dd60aee7f252a93f10ddce6d7560799
Summary:
In another diff, Adam Simpkins noticed that HgImporterStatsTest.get_counters was clunky and that the logic belongs in the EdenTestCase base class.
Hoist HgImporterStatsTest.get_counters into EdenTestCase. Also avoid reusing the Thrift client between calls to get_counters because edenfs might restart between calls (in which case the old Thrift client won't work).
Reviewed By: wez
Differential Revision: D15514537
fbshipit-source-id: 0ae25106baa0e5b2d857b0bb2552d884b9b270ef
Summary:
Sometimes, EdenFS goes bonkers and talks to 'hg debugedenimporthelper' a lot for seemingly no reason.
Make these situations easier to debug by counting how many requests EdenFS makes to 'hg debugedenimporthelper'. These counters let us answer questions such as the following:
* When performance sucks, is EdenFS is making a lot of requests or only a few requests? I.e. is Hg slow at responding or is EdenFS very demanding?
* Does a recent performance issue correlate with EdenFS communicating with 'hg debugedenimporthelper'?
* Which engineers are outliers having orders of magnitude more 'hg debugedenimporthelper' requests than p50 engineers?
We could get fancier with these counters and include the number of bytes received, the duration of the request, etc. For now, just having a request count is useful.
Reviewed By: simpkins
Differential Revision: D14677339
fbshipit-source-id: 7f8f394fb0096aef65d6a8a45d7da5936db539a0