Summary:
Backing stores differentiate between individual tree objects and the
root of a checkout. For example, Git and Mercurial roots are commit
hashes. Allow EdenFS to track variable-width roots to better support
arbitrary backing stores.
Reviewed By: genevievehelsel
Differential Revision: D28619584
fbshipit-source-id: d94f1ecd21a0c416c1b4933341c70deabf386496
Summary:
The meaning of the root ID is defined by the BackingStore, so move
parsing and rendering into the BackingStore interface.
Reviewed By: xavierd
Differential Revision: D28560426
fbshipit-source-id: 7cfed4870d48016811b604348742754f6cdbd842
Summary:
gtest includes some windows headers that will have conflicts with the
folly portability versions. This caused some issues in my in-memory tree
cache diffs (D27050310 (8a1a529fcc)).
We should probably generally be using the folly portable gtests so we can
avoid such issues in the future.
see here for more details: bd600cd4e8/folly/portability/GTest.h (L19)
I ran this with codemod yes to all
- convert all the includes with quotes:
`codemod -d eden/fs --extensions cpp,h '\#include\ "gtest/gtest\.h"' '#include <folly/portability/GTest.h>'`
- convert all the includes with brackets
`codemod -d eden/fs --extensions cpp,h '\#include\ <gtest/gtest\.h>' '#include <folly/portability/GTest.h>'`
- convert the test template
`codemod -d eden/facebook --extensions template '\#include\ <gtest/gtest\.h>' '#include <folly/portability/GTest.h>'`
then used `arc lint` to clean up all the targets files
Reviewed By: genevievehelsel, xavierd
Differential Revision: D28035146
fbshipit-source-id: c3b88df5d4e7cdf4d1e51d9689987ce039f47fde
Summary:
Xavier reported seeing a pile of redundant checkout events in his
journal, which caused Watchman to issue a series of no-op, sequential hg
status calls, which caused Buck to time out waiting for Watchman.
We're not sure what paths cause Mercurial to issue no-op
checkOutRevision Thrift calls, but we can certainly filter them out
inside of the journal.
Reviewed By: fanzeyi
Differential Revision: D26699828
fbshipit-source-id: f738b53bbafa4027dd70322d64413246bb5bb828
Summary:
Timeseries is memory intensive and not really required in the current context
it is being used.
Reviewed By: chadaustin
Differential Revision: D26315632
fbshipit-source-id: ee51c3ad8bef6fce152aa787c8c4602f0b499f92
Summary:
Accumulate every commit transition in the journal, rather than
treating every merged range of journal deltas as having a "from"
commit and a "to" commit. This makes it possible for Watchman to
accurately report the set of changed files across an ABA situation or
rapid `hg next` and `hg prev` operations.
Reviewed By: genevievehelsel
Differential Revision: D26193478
fbshipit-source-id: 8b54b9d5bcefa1811008a3b6e9c3aa25a69471ca
Summary:
The EdenFS codebase uses folly/logging/xlog to log, but we were still relying
on glog for the various CHECK macros. Since xlog also contains equivalent CHECK
macros, let's just rely on them instead.
This is mostly codemodded + arc lint + various fixes to get it compile.
Reviewed By: chadaustin
Differential Revision: D24871174
fbshipit-source-id: 4d2a691df235d6dbd0fbd8f7c19d5a956e86b31c
Summary:
The communication protocol between edenfs and watchman is that edenfs
advertises that "something in the mount has changed!" and it's up to
watchman to call getFilesChangedSince or getCurrentJournalPosition to
inspect what has changed.
This diff avoids redundantly notifying "something in the mount has
changed!" between the initial notification and actually reading the
journal contents. This way, a Watchman subscription cannot (*) fall
behind, and memory usage stays approximately constant during heavy
write traffic.
Hopefully, this will prevent the kernel from thinking the edenfs mount
is a slow IO device under high memory pressure.
* Technically, if there were a multitude of subscribeStreamTemporary
streams, and all but one was not calling getFilesChangedSince, it's
possible that notifications could back up on the stuck
subscriptions.
Reviewed By: wez
Differential Revision: D24090247
fbshipit-source-id: 6561c13e847b749c093adab75250df474d3210f9
Summary:
Before I make further changes to the Journal, improve the comments and
refactor a few small things.
Reviewed By: kmancini
Differential Revision: D24089530
fbshipit-source-id: de9da2c1e6b1c87b6587781cfa55ae7cc4085eeb
Summary:
We are unifying C++ APIs for accessing optional and unqualified fields:
https://fb.workplace.com/groups/1730279463893632/permalink/2541675446087359/.
This diff migrates code from accessing data members generated from unqualified
Thrift fields directly to the `field_ref` API, i.e. replacing
```
thrift_obj.field
```
with
```
*thrift_obj.field_ref()
```
The `_ref` suffixes will be removed in the future once data members are private
and names can be reclaimed.
The output of this codemod has been reviewed in D20039637.
The new API is documented in
https://our.intern.facebook.com/intern/wiki/Thrift/FieldAccess/.
drop-conflicts
Reviewed By: iahs
Differential Revision: D22764126
fbshipit-source-id: 67b1bc6d4a9135f594d78325cee8a194255bdcb8
Summary:
We are unifying C++ APIs for accessing optional and unqualified fields:
https://fb.workplace.com/groups/1730279463893632/permalink/2541675446087359/.
This diff migrates code from accessing data members generated from unqualified
Thrift fields directly to the `field_ref` API, i.e. replacing
```
thrift_obj.field
```
with
```
*thrift_obj.field_ref()
```
The `_ref` suffixes will be removed in the future once data members are private
and names can be reclaimed.
The output of this codemod has been reviewed in D20039637.
The new API is documented in
https://our.intern.facebook.com/intern/wiki/Thrift/FieldAccess/.
drop-conflicts
Reviewed By: yfeldblum
Differential Revision: D22631599
fbshipit-source-id: 9bfcaeb636f34a32fd871c7cd6a2db4a7ace30bf
Summary:
Thrift setter API is deprecated since it doesn't bring any value over direct assignment. Removing it can reduce build-time and make our codebase more consistent.
If result of `s.set_foo(bar)` is unused, this diff replaces
s.set_foo(bar);
with
s.foo_ref() = bar;
Otherwise, it replaces
s.set_foo(bar)
with
s.foo_ref().emplace(bar)
Reviewed By: chadaustin
Differential Revision: D21712029
fbshipit-source-id: 3a332b4bf6fac6b3cf396d34e6d5ca4849181a6d
Summary: All of these tests are passing on Windows with no changes.
Reviewed By: chadaustin
Differential Revision: D21341729
fbshipit-source-id: 2b4d52751e74fa953bfe5143dc0c5735de2d34cf
Summary: Tracing was not an accurate name for what this directory had become. So rename it to telemetry.
Reviewed By: wez
Differential Revision: D17923303
fbshipit-source-id: fca07e8447d9b9b3ea5d860809a2d377e3c4f9f2
Summary:
Update the CMakeLists.txt to also build the Python-based `edenfsctl` command
line tool.
This requires switching most of the thrift rules to generate both C++ and
Python sources.
Note that one missing feature at this point is that this does not package
external dependencies into the binary. Currently `edenfsctl` depends on both
`six` and `toml` as external dependencies. For now these must be available in
your `PYTHONPATH` in order to run the generated `edenfsctl` binary.
Reviewed By: chadaustin
Differential Revision: D17127615
fbshipit-source-id: fc138ab39e75c6a5bbd39e3f527d4e9f7f420e46
Summary:
This fixes a few issues with the library dependencies:
- The `eden_utils` library depends on `eden_service_thrift`, not
`eden_service`. By incorrectly depending on `eden_service` this introduced
a circular dependency which would cause a build failure, depending on which
order CMake chose to try and emit the link line.
- The `eden_config` library depends on code from `eden_model` (for `Hash` and
`ParentCommits`)
- The `eden_inodes` library depends on `eden_model_git` for the `GitIgnore`
logic. I also alphabetized the dependency list.
Reviewed By: wez
Differential Revision: D17124930
fbshipit-source-id: 70cbe81081fc1dc807cca13a93edc25ba270b01f
Summary:
The toHash field is unnecessary in HashUpdateDeltas since we only ever iterate in reverse. We instead keep track of the current hash in the journal itself instead of looking at the toHash of the latest HashUpdateDelta.
Delta Struct Sizes:
> File Change: 88
> Hash Update: 96
Reviewed By: strager
Differential Revision: D16522519
fbshipit-source-id: 43baccc8ef2579f72609cc84e81e218794b11725
Summary: Create the second delta type HashUpdateJournalDelta that keeps track of changes to the hash [and the unclean files associated with that hash change]. Journal methods were updated to account for the different delta types.
Reviewed By: strager
Differential Revision: D16520444
fbshipit-source-id: 2a5cea11c9e70e30f6db55d9c8e33f9322ae91fc
Summary: Add in two tests for deltas that change commit hashes since a test for this did not exist before. Previously a bug in setting the fromHash of a result only showed up as a failure in Watchman's integration tests with Eden.
Reviewed By: chadaustin
Differential Revision: D16528248
fbshipit-source-id: 56eede749ef2da4dc492a1f7376dc07ca8aa3050
Summary: Deltas are stores in a deque and therefore not malloced independently so therefore there is no reason to be getting the goodMallocSize of them. We also need to do accounting for the buckets in the deque that the deltas fall into.
Reviewed By: strager
Differential Revision: D16566675
fbshipit-source-id: 4506fbbcc2044b8fdfe6244313ef7480cfa5151e
Summary: Replace the uses of std::unique_ptr<JournalDelta> with just the JournalDelta itself to avoid an extra allocation.
Reviewed By: chadaustin, strager
Differential Revision: D16572089
fbshipit-source-id: be080b2fb9096f6c8783e2ecae21a99466336f6f
Summary:
Adds a debug command to eden such that users can flush the journal and cause the subscribers to get a truncated result, this will be useful for debugging [to debug we won't have to lower the memory limit and repeatedly touching files to fill up the journal].
Can be used with "eden debug flush_journal"
Reviewed By: chadaustin
Differential Revision: D16348811
fbshipit-source-id: fdbe6729d0393c424addcd42a091de440383035b
Summary: Refactoring the addDelta function to allow other Journal functions to add deltas while holding the deltaState lock. This refactoring will be used to create the flush function for the journal (which needs to hold the lock to empty the journal but wants to add a delta before releasing the lock).
Reviewed By: chadaustin
Differential Revision: D16341196
fbshipit-source-id: 7b7b1d933802b466efe624378206c72c71469129
Summary: Keep track of the longest query we had to resolve in accumulateRange. This stat will be helpful in determining how large the memory limit of the journal should be since we will know how far back people usually need to go.
Reviewed By: strager
Differential Revision: D16227920
fbshipit-source-id: a41c3b9f16b701cd8165e20409888983b8899dab
Summary: The journal will now keep of how many reads from accumulateRange have been truncated. This is a useful metric that will allow us to see how often we are forcing Watchman to use its fallback of creating a fresh instance and help us in calibrating if we find our memory limit is too small.
Reviewed By: strager
Differential Revision: D16017968
fbshipit-source-id: 95f4fbd1fd2d8523ff397202172408e1c89669be
Summary: D16096960 accidentally caused the truncation code to be run twice (once inline and once via a function code), this shouldn't cause any difference in outcome but is unnecessary.
Reviewed By: strager
Differential Revision: D16508798
fbshipit-source-id: 12781aee98e70e5105c5476d29cf5cdd1e31062d
Summary: Set up the infrastructure to add in Timeseries to the journal so that we can add in stats for the journal that relate to the whole process. For example, allow us to add in a truncatedRead TimeSeries easily as done in D16017968
Reviewed By: chadaustin
Differential Revision: D16461081
fbshipit-source-id: 964ff32e62aed0369da434793491b857c136b074
Summary:
To save on memory the journal will now compact the same action repeated multiple times into the same action. This means that modifying the same file 100 times in a row results in 1 Journal delta instead of 100. [The results will cause Watchman to act the same since all queries are down from the current time, changes should only be visible by the number of deltas in the journal, how much memory the deltas are using, "eden debug journal" which will show that sequenceID's were skipped, and the fromSequence/fromTime returned by accumulateRange might be different]
**Memory Improvements:**
For buck commands, 1 run was conducted for each with a buck clean done before each build and then eden being restarted (so the clean did not affect the outcome) [results are formatted as 'with compaction' / 'without compaction']
“buck build mode/opt eden”
Entries: 154145 / 206108 [25.2% reduction]
Memory: 46.2 MB / 61.4 MB [24.7% reduction]
“buck build mode/opt warm_storage/common/...”
Entries: 318820 / 405016 [21.3% reduction]
Memory: 95.8 MB / 121.5 MB [21.2% reduction]
For Nuclide the result was calculated by getting the number of entries in the journal vs the last sequence ID in the journal ('entries we actually have' / 'entries we would have without compaction')
Using Nuclide’s Smart Log and Checking Out various commits / arc pulling:
Entries: 6091 / 23671 [74.3% reduction]
Reviewed By: chadaustin
Differential Revision: D16096960
fbshipit-source-id: f542ae32c889ebc9da442285d808ce75247f7e65
Summary:
This diff updates Eden's journal to be bounded in terms of memory usage which should help lessen the likelihood of Eden OOMing and taking up a large amount of our users' resources.
The memory limit is set to be 1 GB per journal [so a user with 3 mounts could expect the journals to possibly use up to 3 GB of memory].
The landing of this diff will need to wait until a version of Watchman that can handle truncation is deployed on all machines using Eden. This means we need to wait for a version of Watchman with D16219267 to fully land to machines.
Reviewed By: strager
Differential Revision: D15954994
fbshipit-source-id: 9a6425527f10a1ce051feb8fc7d092a84712f338
Summary: Running `eden debug journal -f` will print and follow the eden journal in a similar style to the unix `tail -f` command.
Reviewed By: chadaustin
Differential Revision: D16112458
fbshipit-source-id: 5304cd0f857bdbeca41c2591e98920f4f1fc8f42
Summary: Sets up a thrift interface to set the size of the journal (until truncation is added in the size field in the journal currently does nothing other than being viewable from getMemoryLimit)
Reviewed By: chadaustin
Differential Revision: D16042286
fbshipit-source-id: bc0acdf4ac5516cfac66fa0fbd87254d08ad479b
Summary: Journal not specifying template argument for optional was causing eden builds to fail for old versions of C++. See D15978960.
Reviewed By: chadaustin
Differential Revision: D16059722
fbshipit-source-id: fb9903e304d9c5f79fc2d34f7763c3ada6ae4553
Summary: Shows the end-to-end duration of the journal in "eden stats"
Reviewed By: chadaustin
Differential Revision: D15993261
fbshipit-source-id: 46471faca17d4f12ccdd8cea55b2722e33519a74
Summary:
Moving from the previous linked list approach to having a deque in the Journal.
The memory used by the deque is not tracked currently, the difference might be negligible though. I ran a script that touched a file 62 million times and here is the result comparing rss_memory used in eden and the journal's estimated memory: https://fburl.com/ods/6sq8g5vc and it still seems to closely estimate the right amount.
Reviewed By: strager
Differential Revision: D15944614
fbshipit-source-id: 6a1ac34ecd80c0eecb80411984f88f62ae712e91
Summary: Change getLatest such that it just returns Info about the latest delta and not the delta itself.
Reviewed By: strager
Differential Revision: D15931214
fbshipit-source-id: c7a1cf4d62cdd4f9396fab46354eabcbb31f32c0
Summary: Combine the backend function that iterates over the deltas into one function so we do not need to modify the code in multiple places as we start to modify the internal structure (Additionally pulled the one use of the code outside of the journal into the journal)
Reviewed By: strager
Differential Revision: D15912907
fbshipit-source-id: 2f59ff9e7f33ffa5b420859153a609e68bda10b4
Summary: Adding a nullptr check for the case when the journal is empty and removing the use of default parameters.
Reviewed By: strager
Differential Revision: D15907329
fbshipit-source-id: 787b4a44f835fd8d128496ee6655e02987db98a7
Summary:
Update the copyright & license headers in CMake files to reflect the
relicensing to GPLv2
Reviewed By: wez
Differential Revision: D15487079
fbshipit-source-id: 715e559464c19a0070d6e55a095b3fc7d61ad2f8
Summary:
Update the copyright & license headers in C++ files to reflect the
relicensing to GPLv2
Reviewed By: wez
Differential Revision: D15487078
fbshipit-source-id: 19f24c933a64ecad0d3a692d0f8d2a38b4194b1d
Summary: Merge was a function on a JournalDelta that created a new JournalDelta and optionally left it connected to old JournalDeltas. AccumulateRange is a new function on the Journal itself (acting on the latest delta) that creates a JournalDeltaSum (which can't be connected to older Deltas)
Reviewed By: chadaustin
Differential Revision: D15881452
fbshipit-source-id: 573505c1171f78d46afc98f1db9b5b9ee2fff60f
Summary: Making addDelta private and giving users a more user-friendly way of appending entries to the journal.
Reviewed By: chadaustin, strager
Differential Revision: D15868089
fbshipit-source-id: 00c8a3066f0e4483e3c792651ade5f6a7ea05eed
Summary:
Some newer versions of `clang` (such as Apple's version 11) will warn/error out if a constructor or assignment operator
marked `default` is implicitly deleted (e.g., if the object contains a non-moveable/non-copyable member). This diff
removes all such defaulted constructors/assignment operators, which I ran into while building `edenfs` on my Macbook Pro.
Reviewed By: chadaustin, strager
Differential Revision: D15901794
fbshipit-source-id: 794ed8377693a6735bb567635dc919bc678751a4
Summary: JournalStats is currently O(# of deltas), updating it to be O(1)
Reviewed By: chadaustin
Differential Revision: D15718255
fbshipit-source-id: 1fb3f0b76d736bfa22195231c21d5f8b742fa1f7