18 Commits

Author SHA1 Message Date
Saurabh Singh
cfe084d02f telemetry: switch to using quantile stats instead of timeseries
Summary:
Timeseries is memory intensive and not really required in the context where it
is currently used.

Reviewed By: chadaustin

Differential Revision: D26315632

fbshipit-source-id: ee51c3ad8bef6fce152aa787c8c4602f0b499f92
2021-02-14 16:37:08 -08:00
Saurabh Singh
0b9c2fb148 telemetry: switch to using quantile stats instead of histograms
Summary:
Histograms are memory intensive and require specifying fixed buckets for
instantiation.
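As a rough illustration of the trade-off described above (a Python sketch, not EdenFS code), compare a histogram whose bucket bounds must be fixed at construction with a quantile summary computed from the samples themselves. `FixedBucketHistogram` and `exact_quantiles` are hypothetical names; a production quantile stat would use a streaming estimator rather than keep every sample, which is done here only for clarity.

```python
import bisect
import statistics


class FixedBucketHistogram:
    """Toy histogram: bucket bounds must be chosen before any sample arrives."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        # One counter per bucket, plus a final overflow bucket.
        self.counts = [0] * (len(self.bounds) + 1)

    def add(self, value):
        # Values above the last bound all land in the overflow bucket.
        self.counts[bisect.bisect_right(self.bounds, value)] += 1


def exact_quantiles(samples):
    """Return p50 and p99 of the samples; no bucket layout is needed."""
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {"p50": cuts[49], "p99": cuts[98]}
```

With bounds `[10, 100]`, the samples 500 and 5000 become indistinguishable in the overflow bucket, whereas the quantile summary adapts to whatever range the data covers.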

Reviewed By: chadaustin

Differential Revision: D26315631

fbshipit-source-id: 6ce5459b8c1c6c25d84285baf96df55ce9119b1a
2021-02-14 16:37:08 -08:00
Chad Austin
acaba247cf implement FUSE_FALLOCATE
Summary:
Some linkers (lld being one) use fallocate() or posix_fallocate() on
the output file before writing its contents. EdenFS would return
ENOSYS or ENOTSUP so glibc would fall back and write a single byte to
every 512 byte block, which is terribly slow and generates a bunch of
fake traffic in the Watchman journal.

This diff implements basic support for FUSE_FALLOCATE, avoiding this
slow emulation.
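To show why the fallback is so costly, here is a simplified Python sketch of the emulation described above, assuming a fresh, empty file on a Unix system. `emulate_fallocate` and `preallocate` are hypothetical helper names, and the real glibc fallback differs in detail (for example, it preserves existing data, which this sketch does not).

```python
import os


def emulate_fallocate(fd, offset, length, block=512):
    """Reserve [offset, offset + length) by writing one byte per block.

    Returns the number of write calls issued. Only suitable for a fresh
    file, because it overwrites the touched bytes with zeros.
    """
    writes = 0
    end = offset + length
    pos = offset + block - 1
    while pos < end - 1:
        os.pwrite(fd, b"\0", pos)
        writes += 1
        pos += block
    # Touch the final byte so the file reaches its full size.
    os.pwrite(fd, b"\0", end - 1)
    writes += 1
    return writes


def preallocate(fd, offset, length):
    """Single-call path: ask the filesystem to reserve the whole range."""
    os.posix_fallocate(fd, offset, length)
```

Reserving 1 MiB through `emulate_fallocate` issues 2048 separate writes, each of which a FUSE filesystem has to service, while `preallocate` is a single request when the filesystem supports it.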

Reviewed By: xavierd

Differential Revision: D25934694

fbshipit-source-id: c6c90ea2b517d4dbedce29d9a4340870c8c177c3
2021-01-20 19:04:59 -08:00
Andres Suarez
21c95391ca Apply clang-format update fixes
Reviewed By: igorsugak

Differential Revision: D25861960

fbshipit-source-id: e3c39c080429058a58cdc66d45350e5d1420f98c
2021-01-10 10:06:29 -08:00
Xavier Deguillard
754d132783 Back out "prjfs: handle concurrent file/directory removal"
Summary: Something is wrong with this which causes Unity to freak out.

Reviewed By: fanzeyi

Differential Revision: D25453230

fbshipit-source-id: 89f61fd97817403fa65071ddac022a226b775e53
2020-12-10 07:42:38 -08:00
Xavier Deguillard
2bfdf6f481 prjfs: handle concurrent file/directory removal
Summary:
A while back, we saw that concurrent directory creation would lead to EdenFS
being confused and failing to record some of the created directories. This then
caused EdenFS to no longer be in sync with what was on disk. To handle this
case, we had to manually create these directories recursively.

What I didn't realize at the time was that these concurrent notifications could
also happen on removal this time, and if a directory removal notification wins
the race against the removal of its last children, that directory wouldn't be
removed and EdenFS would once again be confused about the state of the
repository.

Fixing this is a bit trickier than directory creation as it's racier.
Consider a directory that is being removed, and then immediately recreated with
a file in it in a different process. The naive approach of simply force
removing all of the children of a directory when handling the removal
notification would clash with the file creation. We could argue that nobody
should be doing this, but there would be an unhandled race, and thus a bug
where data would potentially be lost[0].

We can however fix this bug slightly differently. For file/directory removal,
we can actually hook onto the pre-callback, i.e. one that happens before the
file/directory is no longer visible on disk. This inherently eliminates the race
altogether as the callback will be guaranteed to run when none of its children
are present, and if a race happens with a file creation in it, we can simply
fail the removal properly.
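The idea can be modelled with a small in-memory toy (a Python sketch under stated assumptions, not the actual PrjFS code). `ToyMount` and its methods are hypothetical; the lock stands in for whatever serialization the real callbacks rely on. The pre-removal hook runs while the directory still exists, so it can refuse the removal if a concurrent creation has added a child.

```python
import threading


class ToyMount:
    """Hypothetical in-memory stand-in for the mount's view of the tree."""

    def __init__(self):
        self._lock = threading.Lock()
        self._children = {"": set()}  # directory path -> child names

    @staticmethod
    def _join(parent, name):
        return f"{parent}/{name}".lstrip("/")

    def create(self, parent, name, is_dir=False):
        with self._lock:
            self._children[parent].add(name)
            if is_dir:
                self._children[self._join(parent, name)] = set()

    def remove_file(self, parent, name):
        with self._lock:
            self._children[parent].discard(name)

    def pre_remove_dir(self, parent, name):
        """Called before the directory disappears; returns True to allow it."""
        path = self._join(parent, name)
        with self._lock:
            if self._children[path]:
                # A concurrent creation won the race: fail the removal.
                return False
            del self._children[path]
            self._children[parent].discard(name)
            return True
```

Because the check and the bookkeeping happen before the directory vanishes, there is no window in which a child exists on disk under a directory the model has already forgotten.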

The only tricky bit is for the renaming logic, as renaming a file is logically
a removal followed by a creation. For that reason, I've moved part of the
renaming bits to the pre-callback too.

In theory, this change may negatively affect workloads that do concurrent
directory removal as the duration during which a file/directory is visible
on disk now includes the EdenFS callback while it didn't before. Such workflows
should be fairly rare and/or redirected to avoid EdenFS altogether if
performance matters.

[0]: This left-over file that EdenFS wouldn't be aware of would also later
cause the checkout code to fail due to invalidation failures triggered when
trying to invalidate that directory. This would be fairly hard to debug.

Reviewed By: fanzeyi

Differential Revision: D25112381

fbshipit-source-id: 9300499ce872ad93d0a687f0e61b7e2a9caf9556
2020-12-04 14:25:44 -08:00
Xavier Deguillard
8b82dc96cb prjfs: make readdir asynchronous
Summary:
As of right now, opendir is the expensive callback due to fetching the sizes
for all the files in a directory. This strategy however breaks down when
timeouts are added as very large directories would trigger the timeout, not
because EdenFS is having a hard time reaching Mononoke, but because of
bandwidth limitations.

To avoid this issue, we need to have a per-file timeout and thus make opendir
just trigger the futures, but not wait on them. The waiting bit will happen
at readdir time which will also enable having a timeout per file.
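The split between starting the fetches and waiting on them can be sketched in Python with `concurrent.futures` (an illustration only; the real code uses C++ futures). `DirHandle`, `fetch_size`, and `per_file_timeout` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


class DirHandle:
    """Toy directory handle: opendir starts the work, readdir waits for it."""

    def __init__(self, names, fetch_size, executor):
        # opendir: kick off one size fetch per file, but do not wait.
        self.pending = [(name, executor.submit(fetch_size, name)) for name in names]

    def readdir(self, per_file_timeout):
        # readdir: wait on each future separately, so the timeout applies
        # per file instead of to the directory as a whole.
        entries = []
        for name, future in self.pending:
            try:
                entries.append((name, future.result(timeout=per_file_timeout)))
            except FutureTimeout:
                entries.append((name, None))  # size unknown for this entry
        return entries
```

A caller would create the handle with a `ThreadPoolExecutor` and call `readdir` later; a large directory then no longer trips a single overall timeout merely because it has many entries.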

The one drawback to this is that the ObjectFetchContext that is passed in by
opendir cannot live long enough to be used in the size future. For now, we can
use a null context, but a proper context will need to be passed in, in the
future.

Reviewed By: wez

Differential Revision: D24895089

fbshipit-source-id: e10ceae2f7c49b4a006b15a34f85d06a2863ae3a
2020-11-13 14:27:26 -08:00
Xavier Deguillard
436a847a79 win: make read callback fully asynchronous
Summary:
Similarly to the other callbacks, this makes the main function return to
ProjectedFS as soon as the future is created which will allow for it to be
interrupted in a subsequent diff.

Reviewed By: fanzeyi

Differential Revision: D23745754

fbshipit-source-id: 2d77d0eacfe0d37eb9075bf9f0660e4f4af77e8f
2020-09-23 09:43:34 -07:00
Xavier Deguillard
49e13b3db6 win: make the queryFileName callback asynchronous
Summary: Similarly to the other one, this will make it possible to interrupt.

Reviewed By: fanzeyi

Differential Revision: D23643100

fbshipit-source-id: 0daab1cec94d0e177bb707d97bf928b05d5d24a3
2020-09-16 18:59:26 -07:00
Xavier Deguillard
5512d7c09f win: make the getPlaceholder callback asynchronous
Summary: Similarly to the other callback, this will make it possible to interrupt.

Reviewed By: fanzeyi

Differential Revision: D23643101

fbshipit-source-id: 9f9a48e752a850c63255b8867b980163cb6a92c9
2020-09-16 18:59:26 -07:00
Xavier Deguillard
ff16c758d2 win: make the opendir callback asynchronous
Summary:
The opendir callback tends to be the most expensive of all due to having to
fetch the content of all the files. This leads to some frustrating UX as the
`ls` operation cannot be interrupted. By making this asynchronous, the slow
operation can be interrupted. The future isn't cancelled and thus it will
continue to fetch in the background; this will be tackled in a future diff.

Reviewed By: fanzeyi

Differential Revision: D23630462

fbshipit-source-id: f1c4a9fbd9daa18ca4b8f4837c5241a37ccfbcf9
2020-09-16 18:59:25 -07:00
Xavier Deguillard
0904c05535 win: plumb PrjfsRequestContext in the notification callback
Summary:
Now that all the pieces are in place, we can plumb the request context in. For
now, this adds it to only one callback as I figure out more about it and tweak
it until I have something satisfactory. There are some rough edges with it that
I'm not entirely happy about, but as I change the notification callback to be
more async, I'm hoping to make it more convenient to use and less clunky.

Reviewed By: fanzeyi

Differential Revision: D23505508

fbshipit-source-id: d5f12e22a8f67dfa061b8ad82ea718582c323b45
2020-09-16 18:59:24 -07:00
Xavier Deguillard
37df55b270 telemetry: consolidate Fuse/PrjFS stats in ChannelThreadStats
Summary:
This helps make RequestData slightly more generic by depending less on Fuse
specific types/code.

Reviewed By: chadaustin

Differential Revision: D23467487

fbshipit-source-id: 830f8269e2114c2968dcc49d3b5574e687191e4d
2020-09-02 15:28:39 -07:00
Xavier Deguillard
e5558221ac store: plumb CMD_CAT_TREE
Summary:
With Mercurial now supporting CMD_CAT_TREE for efficiently fetching and reading
trees, we can plumb this onto EdenFS. At startup time, we detect whether
Mercurial supports CMD_CAT_TREE and use that method, otherwise, we fall back to
the old CMD_FETCH_TREE.
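The detect-once-then-dispatch pattern looks roughly like the following Python sketch. How the capability is actually detected is not stated here, so `supported_commands` (a set of command names reported by the helper process) and the `TreeImporter` class are assumptions made for illustration.

```python
class TreeImporter:
    """Toy importer that picks its tree-fetch command once, at startup."""

    def __init__(self, supported_commands):
        # Assumption: the helper reports the commands it understands.
        if "CMD_CAT_TREE" in supported_commands:
            self.tree_command = "CMD_CAT_TREE"
        else:
            self.tree_command = "CMD_FETCH_TREE"  # older fallback

    def request_for(self, tree_id):
        # Every later tree request reuses the decision made at startup.
        return (self.tree_command, tree_id)
```

Deciding once at startup keeps the per-request path free of capability checks.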

Reviewed By: wez

Differential Revision: D23044953

fbshipit-source-id: 9aea5c5b82e97039a75ef18976a155dcb6e150bc
2020-08-12 08:17:25 -07:00
Xavier Deguillard
07df8faf5e win: when creating a file/directory, create the parents too
Summary:
As opposed to FUSE, ProjectedFS sends notifications for file/directory creation
after the fact, and for directories that means these will be visible on disk before
EdenFS may be aware of them. While EdenFS usually processes them quickly, a heavily
multi-threaded application that tries to concurrently create a directory
hierarchy may end up sending notifications to EdenFS in a somewhat out of order
fashion.

Since this should be a very rare occurrence, we make this a very slow path by
being optimistic and calling `getInode` first, and then only if that fails, we
aggressively create all the parent directories. During a buck build of ~1k
jobs, this happened only 3 times.
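A minimal Python model of this optimistic approach is shown below (a sketch, not the EdenFS implementation). `ToyInodeMap`, `get_inode`, and `slow_path_hits` are hypothetical names; paths are tuples of components, with `()` as the root.

```python
class ToyInodeMap:
    """Toy model: known directories, plus a counter for the slow path."""

    def __init__(self):
        self.dirs = {()}          # () is the root directory
        self.slow_path_hits = 0   # how often parents had to be created

    def get_inode(self, path):
        if path not in self.dirs:
            raise KeyError(path)
        return path

    def on_dir_created(self, path):
        try:
            # Fast path: the parent is almost always known already.
            self.get_inode(path[:-1])
        except KeyError:
            # Slow path: the notification arrived out of order, so create
            # every missing ancestor before recording the directory.
            self.slow_path_hits += 1
            for depth in range(1, len(path)):
                self.dirs.add(path[:depth])
        self.dirs.add(path)
```

If the notification for `a/b/c` arrives before those for `a` and `a/b`, the slow path runs once and the later notifications take the fast path.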

If we fully think this through, this change doesn't fully fix the race, as a
similar race can now happen when create and remove/rename operations are
concurrent. However, a client performing these operations concurrently is
either aware that this is racy and should handle these properly, or is most
likely buggy. Both of these should significantly reduce the likelihood of this
happening, thus, I'm leaving this unfixed for now.

To better understand how frequently this happens, I've added a stat counter.
For now, these aren't published to ODS, but this will be tackled later.

Reviewed By: wez

Differential Revision: D22783484

fbshipit-source-id: ea3aafc2f77b65d3967f697f68114921d5909137
2020-07-29 12:17:17 -07:00
Katie Mancini
3a035094f8 Record Mercurial tree import time
Summary: - added logging only around the import tree call to capture non-queue related wait time

Reviewed By: chadaustin, fanzeyi

Differential Revision: D20207472

fbshipit-source-id: d88bb34ce224a26ff2be100d7789ddeff608006d
2020-03-03 11:44:28 -08:00
Katie Mancini
52e211fe8e Record Mercurial file import time
Summary:
- added logging only around the import blob call to capture non-queue related wait time
- added to `test_reading_file_gets_file_from_hg` in `integration.stats_test.HgBackingStoreStatsTest` to test import blob logging in addition to the get blob logging

(not yet done for importing trees, will do in next diff)

Reviewed By: chadaustin

Differential Revision: D20201215

fbshipit-source-id: c89281fe7d3d6e89d111ac8cce9014adff44ac40
2020-03-03 11:44:27 -08:00
Chad Austin
65c93484e2 rename tracing to telemetry
Summary: Tracing was not an accurate name for what this directory had become. So rename it to telemetry.

Reviewed By: wez

Differential Revision: D17923303

fbshipit-source-id: fca07e8447d9b9b3ea5d860809a2d377e3c4f9f2
2019-10-15 13:39:41 -07:00