Summary:
This is a fairly mechanical diff that finalizes the split of Hash into ObjectId and Hash20.
More specifically, this diff does two things:
* Replaces `Hash` with `Hash20`
* Removes alias `using Hash = Hash20`
Reviewed By: chadaustin
Differential Revision: D31324202
fbshipit-source-id: 780b6d2a422ddf6d0f3cfc91e3e70ad10ebaa8b4
Summary:
The goal of this stack is to remove the proxy hash type, but to achieve that we first need to address some tech debt in the Eden codebase.
For a long time EdenFS had a single Hash type that was used for many different use cases.
One of the major uses of the Hash type is to identify internal EdenFS objects such as blobs, trees, and others.
We seem to have reached agreement that we need a different type for those identifiers, so this diff introduces a separate ObjectId type for them and replaces _some_ usages of Hash with ObjectId.
We still retain the original Hash type for other use cases.
Roughly speaking, this is how this diff divides usages between Hash and ObjectId:
**ObjectId**:
* Everything that is stored in the local store (blobs, trees, commits)
**Hash20**:
* Explicit hashes (SHA-1 of the blob)
* Hg identifiers: manifest id and blob hg id
For now, in this diff ObjectId has exactly the same content as Hash, but this will change in future diffs. Doing it this way keeps the diff size manageable; migrating to the new ObjectId right away would produce an insanely large diff that would be hard both to write and to review.
There are a few more things that need to be done before we can get to the meat of removing proxy hashes:
1) Replace includes of Hash.h with ObjectId.h where needed
2) Remove the Hash type and explicitly rename the rest of the Hash usages to Hash20
3) Modify the content of ObjectId to support new use cases
4) Modify serialized metadata and possibly other places that assume ObjectId size is fixed and equal to Hash20 size
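As a rough illustration of the split described above, here is a minimal sketch of two distinct types that carry the same 20 bytes but cannot be mixed up by accident. The class bodies, `getBytes()` and `localStoreKeySize()` are hypothetical stand-ins for this example, not the real EdenFS definitions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in: a fixed-width 20-byte hash (e.g. a SHA-1).
class Hash20 {
 public:
  static constexpr std::size_t kRawSize = 20;
  using Storage = std::array<std::uint8_t, kRawSize>;

  explicit Hash20(const Storage& bytes) : bytes_(bytes) {}
  const Storage& getBytes() const { return bytes_; }

 private:
  Storage bytes_;
};

// Hypothetical stand-in: an identifier for objects kept in the local store.
// For now it has the same content as Hash20, but it is a separate type.
class ObjectId {
 public:
  using Storage = Hash20::Storage;

  explicit ObjectId(const Storage& bytes) : bytes_(bytes) {}
  const Storage& getBytes() const { return bytes_; }

 private:
  Storage bytes_;
};

// The two types do not convert into each other implicitly, so passing a
// Hash20 where an ObjectId is expected becomes a compile error.
static_assert(
    !std::is_convertible<Hash20, ObjectId>::value,
    "Hash20 must not silently become an ObjectId");
static_assert(
    !std::is_convertible<ObjectId, Hash20>::value,
    "ObjectId must not silently become a Hash20");

// Hypothetical API that only accepts object identifiers.
inline std::size_t localStoreKeySize(const ObjectId& id) {
  return id.getBytes().size();
}
```

Because the content is identical, call sites can be migrated one at a time, and the compiler flags every place that still passes the wrong type.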
Reviewed By: chadaustin
Differential Revision: D31316477
fbshipit-source-id: 0d5e4460a461bcaac6b9fd884517e129aeaf4baf
Summary:
VC++ 2019 is pickier about which standard library includes include
each other. Be explicit.
Reviewed By: zhengchaol
Differential Revision: D31186916
fbshipit-source-id: 95cfa8848d0e2e312e2024923fa166db5f68dde0
Summary:
folly::format is deprecated in favor of fmt and std::format. Migrate
most of EdenFS to fmt instead.
Differential Revision: D31025948
fbshipit-source-id: 82ed674d5e255ac129995b56bc8b9731a5fbf82e
Summary:
To eliminate the need for proxy hashes, we need variable-width object
IDs. Introduce an ObjectId type much like RootId.
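A minimal sketch of what a variable-width identifier could look like, assuming it simply wraps an owned byte string. The name `VariableObjectId` and its methods are made up for this example and are not the real ObjectId or RootId interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical variable-width identifier: any number of raw bytes.
class VariableObjectId {
 public:
  explicit VariableObjectId(std::string bytes) : bytes_(std::move(bytes)) {}

  std::string_view getBytes() const { return bytes_; }
  std::size_t size() const { return bytes_.size(); }

  // Render the raw bytes as lowercase hex for logging.
  std::string asHexString() const {
    static constexpr char kDigits[] = "0123456789abcdef";
    std::string out;
    out.reserve(bytes_.size() * 2);
    for (unsigned char c : bytes_) {
      out.push_back(kDigits[c >> 4]);
      out.push_back(kDigits[c & 0x0f]);
    }
    return out;
  }

  friend bool operator==(
      const VariableObjectId& a,
      const VariableObjectId& b) {
    return a.bytes_ == b.bytes_;
  }

 private:
  std::string bytes_;
};
```

Nothing in this type assumes a 20-byte width, which is the property needed before an identifier can carry more than a hash.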
Reviewed By: genevievehelsel
Differential Revision: D30819412
fbshipit-source-id: 07a185ba6b866b475c92f811e70aa00a8a9f895f
Summary:
In preparation for expanding to variable-width hashes, rename the
existing hash type to Hash20.
Reviewed By: genevievehelsel
Differential Revision: D28967365
fbshipit-source-id: 8ca8c39bf03bd97475628545c74cebf0deb8e62f
Summary:
Looking at strobelight while performing an `eden prefetch` shows that a lot of
time is spent copying data around. For instance, the list of hashes to prefetch
is copied 4 times. Reduce this to a single copy, made when converting Hash to a
ByteRange.
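To illustrate the idea of handing out views instead of copies, here is a minimal sketch. `ByteView` is a hypothetical stand-in for a non-owning byte range, and `Hash20` and `toByteViews()` are simplified for the example; none of this is the actual prefetch code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a byte range: a non-owning (pointer, length) pair.
struct ByteView {
  const std::uint8_t* data;
  std::size_t size;
};

// Simplified 20-byte hash for the example.
struct Hash20 {
  std::array<std::uint8_t, 20> bytes;

  // Returns a view into this object's storage; no bytes are copied.
  ByteView view() const { return ByteView{bytes.data(), bytes.size()}; }
};

// Takes the list by const reference and builds one vector of views.
// The hash bytes themselves stay where they are.
std::vector<ByteView> toByteViews(const std::vector<Hash20>& hashes) {
  std::vector<ByteView> views;
  views.reserve(hashes.size());
  for (const Hash20& hash : hashes) {
    views.push_back(hash.view());
  }
  return views;
}
```

The views are only valid while the original vector is alive and unmodified, which is the usual trade-off for avoiding the copies.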
Reviewed By: chadaustin
Differential Revision: D30433285
fbshipit-source-id: 922e6e5c095bd700ee133e9bb219904baf2ae1ac
Summary:
Backing stores differentiate between individual tree objects and the
root of a checkout. For example, Git and Mercurial roots are commit
hashes. Allow EdenFS to track variable-width roots to better support
arbitrary backing stores.
Reviewed By: genevievehelsel
Differential Revision: D28619584
fbshipit-source-id: d94f1ecd21a0c416c1b4933341c70deabf386496
Summary:
The meaning of the root ID is defined by the BackingStore, so move
parsing and rendering into the BackingStore interface.
Reviewed By: xavierd
Differential Revision: D28560426
fbshipit-source-id: 7cfed4870d48016811b604348742754f6cdbd842
Summary:
EdenFS goes out of its way to track the second working copy parent,
but it never uses it. Stop writing it to the SNAPSHOT file.
Reviewed By: genevievehelsel
Differential Revision: D28453213
fbshipit-source-id: d7d36a1c67553f92234bec911051f4f1d4ef1d4a
Summary:
gtest includes some Windows headers that conflict with the
folly portability versions. This caused some issues in my in-memory tree
cache diffs (D27050310 (8a1a529fcc)).
We should probably use the folly portable gtest header in general so we can
avoid such issues in the future.
See here for more details: bd600cd4e8/folly/portability/GTest.h (L19)
I ran this with codemod, answering yes to all:
- convert all the includes with quotes:
`codemod -d eden/fs --extensions cpp,h '\#include\ "gtest/gtest\.h"' '#include <folly/portability/GTest.h>'`
- convert all the includes with brackets
`codemod -d eden/fs --extensions cpp,h '\#include\ <gtest/gtest\.h>' '#include <folly/portability/GTest.h>'`
- convert the test template
`codemod -d eden/facebook --extensions template '\#include\ <gtest/gtest\.h>' '#include <folly/portability/GTest.h>'`
Then I used `arc lint` to clean up all the targets files.
Reviewed By: genevievehelsel, xavierd
Differential Revision: D28035146
fbshipit-source-id: c3b88df5d4e7cdf4d1e51d9689987ce039f47fde
Summary:
This introduces some basic unit tests to ensure correctness of the cache.
We are adding tests to cover the simple methods of the object cache since we
are using that code path here. And adding a few sanity check tests to make sure
the cache works with trees.
Reviewed By: chadaustin
Differential Revision: D27050296
fbshipit-source-id: b5f0577c1662483f732bb962c5b40bca8e1dcb40
Summary:
Chad first noted that deserializing trees from the local store can be expensive.
On the thrift side, EdenFS does not have a copy of trees in memory. This
means that for glob files, each tree that has not been materialized will be
read from the local store. Since reading and deserializing trees from the local
store can be expensive, let's add an in-memory cache so that some of these
reads can be satisfied from it instead.
This introduces the class for the in-memory cache, which is based on the existing
BlobCache. Note that we keep the minimum-number-of-entries functionality from
the blob cache. This is unlikely to be needed, as trees are much less likely
than blobs to exceed a reasonable cache size limit, but it is kept since we
already have it.
Reviewed By: chadaustin
Differential Revision: D27050285
fbshipit-source-id: 9dd46419761d32387b6f55ff508b60105edae3af
Summary:
We would like to use a limited-size LRU cache for trees as well as blobs,
so I am templatizing this to allow us to use this cache for trees.
Trees will not need to use interest handles, but in the future we could use
this cache for blob metadata, which might want to use interest handles.
Additionally, if at some point we change the inode tree scheme in a way that
removes the tree content from the inodes themselves, interest handles might be
useful for trees. We could also use this cache for proxy hashes, which may or
may not use interest handles. Since some caches may want interest handles and
others will not, I am creating get/insert functions that work with and without
interest handles.
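For readers unfamiliar with the shape of such a cache, here is a minimal sketch of a templated LRU cache without interest handles. `LruObjectCache` and its members are hypothetical names, and the limit here is a simple entry count, which is an assumption of this sketch rather than how the real cache measures its size.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache, templated on key and cached object type.
template <typename Key, typename Object>
class LruObjectCache {
 public:
  using ObjectPtr = std::shared_ptr<const Object>;

  explicit LruObjectCache(std::size_t maxEntries) : maxEntries_(maxEntries) {
    assert(maxEntries_ > 0);
  }

  // Returns the cached object, or nullptr on a miss.
  // A hit marks the entry as most recently used.
  ObjectPtr get(const Key& key) {
    auto it = index_.find(key);
    if (it == index_.end()) {
      return nullptr;
    }
    order_.splice(order_.begin(), order_, it->second);
    return it->second->second;
  }

  // Inserts or replaces an entry, evicting the least recently used entry
  // if the cache grows past its limit.
  void insert(const Key& key, ObjectPtr object) {
    auto it = index_.find(key);
    if (it != index_.end()) {
      it->second->second = std::move(object);
      order_.splice(order_.begin(), order_, it->second);
      return;
    }
    order_.emplace_front(key, std::move(object));
    index_[key] = order_.begin();
    if (order_.size() > maxEntries_) {
      index_.erase(order_.back().first);
      order_.pop_back();
    }
  }

  std::size_t size() const { return order_.size(); }

 private:
  using Entry = std::pair<Key, ObjectPtr>;

  std::size_t maxEntries_;
  std::list<Entry> order_; // front = most recently used
  std::unordered_map<Key, typename std::list<Entry>::iterator> index_;
};
```

The same template can then be instantiated for trees, blobs, or any other object type; interest handles would be layered on top as an additional pair of get/insert overloads.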
Reviewed By: chadaustin
Differential Revision: D27797025
fbshipit-source-id: 6db3e6ade56a9f65f851c01eeea5de734371d8f0
Summary:
It's always annoyed me that HgProxyHash has a constructor which knows
how to load itself from a LocalStore. Add an explicit load() function,
and clean up some other stuff about the class while I'm in there.
Reviewed By: xavierd
Differential Revision: D26769231
fbshipit-source-id: f0ea9f16c3f1fbcd3d4361bcc34845901094b282
Summary:
The world has moved on to UTF-8 as the default encoding for files and data, but
EdenFS still accepts non-UTF-8 filenames to be written to it. In fact, most of
the time when a non-UTF-8 file is written to the working copy, Mercurial ends
up freaking out and crashing even though EdenFS handles it properly. In all of
these cases, the non-UTF-8 files were not intentional, and thus refusing to
create them wouldn't be a loss of functionality.
Note that this diff makes the assumption that Mercurial's manifest only accepts
UTF-8 paths, and thus we only have to protect against non-UTF-8 files being
created in the working copy.
The unfortunate part of this diff is that it makes importing trees a bit more
expensive, as testing that a path is valid UTF-8 is not free.
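To give a sense of the per-path cost, here is a minimal hand-rolled sketch of a UTF-8 validity check. `isValidUtf8()` is a hypothetical helper written for this example; the check the diff actually uses may be implemented differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper: returns true if `s` is well-formed UTF-8.
// Rejects stray continuation bytes, truncated sequences, overlong
// encodings, UTF-16 surrogates, and code points above U+10FFFF.
bool isValidUtf8(std::string_view s) {
  // Smallest code point that legitimately needs a sequence of each length.
  static constexpr std::uint32_t kMinCodePoint[] = {0, 0, 0x80, 0x800, 0x10000};

  std::size_t i = 0;
  while (i < s.size()) {
    const unsigned char lead = static_cast<unsigned char>(s[i]);
    if (lead < 0x80) {
      ++i; // plain ASCII
      continue;
    }

    std::size_t length = 0;
    std::uint32_t codePoint = 0;
    if ((lead & 0xE0) == 0xC0) {
      length = 2;
      codePoint = lead & 0x1F;
    } else if ((lead & 0xF0) == 0xE0) {
      length = 3;
      codePoint = lead & 0x0F;
    } else if ((lead & 0xF8) == 0xF0) {
      length = 4;
      codePoint = lead & 0x07;
    } else {
      return false; // continuation byte or invalid lead byte
    }

    if (s.size() - i < length) {
      return false; // sequence is cut off
    }
    for (std::size_t k = 1; k < length; ++k) {
      const unsigned char next = static_cast<unsigned char>(s[i + k]);
      if ((next & 0xC0) != 0x80) {
        return false;
      }
      codePoint = (codePoint << 6) | (next & 0x3F);
    }

    if (codePoint < kMinCodePoint[length] ||
        (codePoint >= 0xD800 && codePoint <= 0xDFFF) || codePoint > 0x10FFFF) {
      return false;
    }
    i += length;
  }
  return true;
}
```

The check is a single pass over the bytes of each path component, so the cost grows linearly with the amount of path data imported.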
Reviewed By: chadaustin
Differential Revision: D25442975
fbshipit-source-id: 89341a004272736a61639751da43c2e9c673d5b3
Summary:
The StringPiece constructor is untyped, and was only used in tests. We can
afford to build the PathComponent in tests instead to avoid future headaches.
Reviewed By: genevievehelsel
Differential Revision: D25434556
fbshipit-source-id: 4b10bf2576870e81412d76c4b9755b45e26986b3
Summary:
Mercurial supports files with `\` in their name, which can't be represented on
Windows due to `\` being the path separator. Currently, EdenFS throws
errors at the user when such files are encountered. Let's simply warn and
continue.
Reviewed By: chadaustin
Differential Revision: D25430523
fbshipit-source-id: 4167b4cd81380226aead8e4f4850a7738087fd95
Summary:
The code still took a dependency on Mercurial's old manifest code to parse
manifests. It turns out the manifests have a very simple format that we could
parse directly.
This avoids various copies, conversions, std::list, removes ~1k lines of code,
at the expense of adding ~100 lines of code (some of them being C++
boilerplate).
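A minimal sketch of what direct parsing can look like. It assumes each manifest entry has the form `<name>\0<40 hex characters><optional one-character flag>\n`; that layout is my understanding of the format and is not spelled out in this diff. `ManifestEntry` and `parseManifest()` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical parsed entry. The views point into the input buffer,
// so no name or hash bytes are copied.
struct ManifestEntry {
  std::string_view name;
  std::string_view hexNode; // 40 hex characters, not validated here
  char flag; // '\0' when the entry has no flag
};

// Parses entries of the assumed form: <name> NUL <40 hex> [flag] LF
std::vector<ManifestEntry> parseManifest(std::string_view data) {
  std::vector<ManifestEntry> entries;
  while (!data.empty()) {
    const std::size_t nul = data.find('\0');
    if (nul == std::string_view::npos) {
      throw std::invalid_argument("manifest entry is missing a NUL separator");
    }
    const std::size_t eol = data.find('\n', nul + 1);
    if (eol == std::string_view::npos) {
      throw std::invalid_argument("manifest entry is missing a newline");
    }
    const std::string_view rest = data.substr(nul + 1, eol - nul - 1);
    if (rest.size() != 40 && rest.size() != 41) {
      throw std::invalid_argument("manifest entry has an unexpected length");
    }
    entries.push_back(ManifestEntry{
        data.substr(0, nul),
        rest.substr(0, 40),
        rest.size() == 41 ? rest[40] : '\0'});
    data.remove_prefix(eol + 1);
  }
  return entries;
}
```

Returning views into the original buffer is what removes the intermediate copies and conversions; the caller only has to keep the buffer alive while the entries are in use.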
Reviewed By: fanzeyi
Differential Revision: D25385018
fbshipit-source-id: 90d4cda2b7797584bc48c086d5592a7ecaa05dfc
Summary:
The EdenFS codebase uses folly/logging/xlog to log, but we were still relying
on glog for the various CHECK macros. Since xlog also contains equivalent CHECK
macros, let's just rely on them instead.
This is mostly codemodded + arc lint + various fixes to get it to compile.
Reviewed By: chadaustin
Differential Revision: D24871174
fbshipit-source-id: 4d2a691df235d6dbd0fbd8f7c19d5a956e86b31c
Summary:
Previously, when that code was ported to Windows, path separators were
converted from '\' to '/' when a wide string was provided; all other paths
were treated as is.
The main issue with this strategy is that not all paths can be converted: the
non-stored ones, for instance, are immutable, which leads to some subtle bugs
down the line. For instance, the paths "Z:/foo/bar/baz" and "Z:\foo/bar\baz"
would not compare equal because the path separators differ, even though both
are actually the same path underneath.
To solve this, this diff first introduces a Windows path separator, and then
modifies the path comparison functions to ignore the path separator and only
compare the components.
I'm definitely not a fan of the pattern I use for searching for both / and \
in paths; suggestions are welcome for how to improve that.
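A minimal sketch of a separator-insensitive comparison. `isSeparator()` and `pathsEqualIgnoringSeparator()` are hypothetical helpers for this example; the real comparison functions live on the path classes and only treat `\` as a separator on Windows, whereas this sketch does so unconditionally.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: both '/' and '\' count as path separators.
inline bool isSeparator(char c) {
  return c == '/' || c == '\\';
}

// Two paths are equal if they have the same components, regardless of
// which separator character sits between them. Because every separator
// is a single character, it is enough to walk both strings in lockstep
// and treat any pair of separators as equal.
bool pathsEqualIgnoringSeparator(std::string_view a, std::string_view b) {
  if (a.size() != b.size()) {
    return false;
  }
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (isSeparator(a[i]) && isSeparator(b[i])) {
      continue;
    }
    if (a[i] != b[i]) {
      return false;
    }
  }
  return true;
}
```

With this, "Z:/foo/bar/baz" and "Z:\foo/bar\baz" compare equal without either path having to be rewritten, which is what makes it work for immutable, non-stored paths.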
Reviewed By: chadaustin
Differential Revision: D24376980
fbshipit-source-id: 0702bf775c7c3937b2138abd5a63d339ac80aaed
Summary:
Thrift represents `binary` data type as `std::string` in C++. This method will
help us to convert `Hash` into a byte string.
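A minimal sketch of such a conversion, assuming the hash is stored as 20 raw bytes. `toByteString()` is a hypothetical free function for this example and is not necessarily the name or shape of the method added here.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: copy the raw hash bytes into a std::string so they
// can be assigned to a Thrift `binary` field. The explicit length matters
// because the bytes may contain embedded NULs.
std::string toByteString(const std::array<std::uint8_t, 20>& bytes) {
  return std::string(
      reinterpret_cast<const char*>(bytes.data()), bytes.size());
}
```

The result is a byte string, not text: it is always 20 characters long and should not be printed or compared as hex.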
Reviewed By: xavierd
Differential Revision: D24083621
fbshipit-source-id: ae50088db7727d98ca11a017f82b71e942217a17
Summary: This will make it easier to build with Buck.
Reviewed By: fanzeyi
Differential Revision: D23827754
fbshipit-source-id: bf3bf4d607a08b9831f9dfea172b2e923a219561
Summary:
While this isn't the right fix, this is what shipped in our packages. For the
sake of being able to reproduce the package, let's land this as it is. A
future change will remove this ifdef.
Below is pkaush's original description:
In Eden on Windows we treat all files as regular files and don't have a
concept of symlinks or executable files. Fix TreeEntryType::getType()
to return REGULAR_FILE for executable files and symlinks.
Reviewed By: wez
Differential Revision: D20481051
fbshipit-source-id: 0b0c4d7aea28134383ef45aeafc02930b420286b
Summary: All the tests are passing.
Reviewed By: chadaustin
Differential Revision: D21341730
fbshipit-source-id: 90a3872b190879ec163935ff53703157028f87bc
Summary:
The modeFromEntryType and treeEntryTypeFromMode tests for symlinks and
executables had to be disabled, as these functions explicitly do not support
them. Since mode bits are a bit meaningless on Windows, this is probably OK.
Reviewed By: chadaustin
Differential Revision: D21341728
fbshipit-source-id: 86acf24d9ab67a02ecab33b7ebe82a456295fc3c
Summary:
Google Benchmark is easier to use, has more built-in functionality,
and more accurate default behavior than Folly Benchmark, so switch
EdenFS to use it.
Reviewed By: simpkins
Differential Revision: D20273672
fbshipit-source-id: c90c49878592620a83d2821ed4bc75c20e599a75
Summary:
This enables globFiles for Windows, with some
minor tweaks around dtype to enable the build and make
the results consistent between watchman and eden.
Reviewed By: chadaustin
Differential Revision: D20536715
fbshipit-source-id: b1c8184dc664910e4d052a21b4cd993ddfaadf25
Summary:
Eden on Windows doesn't support marking a file as executable or creating symlinks.
Windows doesn't need an executable mode bit to run a file. It runs files based on their executable extension, or the responsible program runs them, the way Python3.exe runs a Python script.
Reviewed By: chadaustin
Differential Revision: D19956268
fbshipit-source-id: c22416db2a9da78e3a5c4392d1537eb7cbf9bfd0
Summary:
In dev mode, the glob benchmark failed inside of
folly::Range::operator[] because asserting null termination
technically violates the bounds check.
Reviewed By: simpkins
Differential Revision: D20268416
fbshipit-source-id: ee9b16a6eb9882e850631aa9d83fffe7b6fb67c3
Summary:
Looking at a log, it wasn't immediately obvious what might have passed
an invalid hash into the Hash constructor. Improve the error message
to make the cause clearer.
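As an illustration of the kind of message that helps here, this is a minimal sketch of a hex parser whose errors include the offending input and its length. `hexValue()` and `parseHexHash()` are hypothetical helpers; the wording of the real error message is not shown in this diff.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper: value of one hex digit, or -1 if it is not one.
inline int hexValue(char c) {
  if (c >= '0' && c <= '9') {
    return c - '0';
  }
  if (c >= 'a' && c <= 'f') {
    return 10 + (c - 'a');
  }
  if (c >= 'A' && c <= 'F') {
    return 10 + (c - 'A');
  }
  return -1;
}

// Hypothetical parser: turns 40 hex characters into 20 raw bytes.
// Every error message quotes the input so the log shows what was passed.
std::array<std::uint8_t, 20> parseHexHash(std::string_view hex) {
  if (hex.size() != 40) {
    throw std::invalid_argument(
        "invalid hex hash \"" + std::string(hex) +
        "\": expected 40 characters, got " + std::to_string(hex.size()));
  }
  std::array<std::uint8_t, 20> out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    const int high = hexValue(hex[2 * i]);
    const int low = hexValue(hex[2 * i + 1]);
    if (high < 0 || low < 0) {
      throw std::invalid_argument(
          "invalid hex hash \"" + std::string(hex) +
          "\": non-hex character near offset " + std::to_string(2 * i));
    }
    out[i] = static_cast<std::uint8_t>((high << 4) | low);
  }
  return out;
}
```

Quoting the input costs a string copy only on the error path, so the common case is unaffected.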
Reviewed By: genevievehelsel
Differential Revision: D18380916
fbshipit-source-id: 620b8fa902a87496b87a5aa0ff304e6991585864
Summary: Removes the `TreeDiffer` class and passes `DiffContext` as the first argument to standalone `TreeDiffer` functions, as per the comment on D17400466. This is setup for processing gitignores in the `TreeDiffer` codepath. (It also allows for an easy implementation of short-circuiting `future_getScmStatusBetweenRevisions`, similar to D17531102.)
Reviewed By: chadaustin
Differential Revision: D17717977
fbshipit-source-id: d480d212474bd80aeac9cd9bb901f97562b62b13
Summary:
Update the copyright & license headers in CMake files to reflect the
relicensing to GPLv2
Reviewed By: wez
Differential Revision: D15487079
fbshipit-source-id: 715e559464c19a0070d6e55a095b3fc7d61ad2f8
Summary:
Update the copyright & license headers in C++ files to reflect the
relicensing to GPLv2
Reviewed By: wez
Differential Revision: D15487078
fbshipit-source-id: 19f24c933a64ecad0d3a692d0f8d2a38b4194b1d
Summary: getEntryPtr() does a case-sensitive lookup, which caused a few Ovrsource builds to fail. Ovrsource code includes header files with the wrong case.
Reviewed By: strager
Differential Revision: D15344850
fbshipit-source-id: 3d5d658a49cdafc07dc9a18a2f3d2073306e8f40
Summary:
D8559702 changed `folly::IOBuf::computeChainDataLength()` to return a `size_t`.
Update our format specifier to match, to avoid compiler warnings on Mac.
Reviewed By: chadaustin
Differential Revision: D14878220
fbshipit-source-id: 19e96bea07c57bb542a848b3688d65143db51d13
Summary:
The issue is that the compiler needs an `else` to see
that we can only reach the throw if none of the other paths are
taken; with that satisfied it believes that we are legitimately
constexpr.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67371
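A minimal illustration of the pattern, using a hypothetical `hexNibble()` helper rather than the function touched by this diff: the `throw` sits in a final `else` branch, so it is only reached when none of the other branches is taken.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical constexpr helper: value of one hex digit.
// The throw lives in the trailing `else`, which keeps affected compilers
// happy when the function is evaluated in a constant expression.
constexpr int hexNibble(char c) {
  if (c >= '0' && c <= '9') {
    return c - '0';
  } else if (c >= 'a' && c <= 'f') {
    return 10 + (c - 'a');
  } else if (c >= 'A' && c <= 'F') {
    return 10 + (c - 'A');
  } else {
    throw std::invalid_argument("invalid hex digit");
  }
}

// Valid inputs are usable at compile time.
static_assert(hexNibble('f') == 15, "hexNibble must be usable in constexpr");
static_assert(hexNibble('7') == 7, "hexNibble must be usable in constexpr");
```

At runtime the same function still throws for an invalid digit; in a constant expression, an invalid digit turns into a compile error instead.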
Reviewed By: chadaustin
Differential Revision: D14638234
fbshipit-source-id: f9524d2816580f41842a40e30118b03998c3660a
Summary:
This diff adds the dtype field to the glob results;
this will help to reduce the cost of some watchman queries by avoiding a
getFileInformation call that instantiates inodes.
As part of this, I added a bunch of unit test coverage.
Reviewed By: strager
Differential Revision: D8779149
fbshipit-source-id: 3064a3e42be55ec576fed9e0f7112edef426f32d