sapling/eden/fs/model
Andrey Chursin 08f337f7ab embed proxy hashes into object id [proxy hash removal 7/n]
Summary:
This diff introduces the store:embed-proxy-hashes config.

When this config is set, we store the HgId directly in the ObjectId instead of using a proxy hash object.
This lets us bypass the proxy hash RocksDB storage when reading files.
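
As a rough illustration of the idea, the sketch below embeds a 20-byte hg id in a variable-length id and recovers it without any store lookup. The `SketchObjectId` type, the one-byte tag, and the tag value are hypothetical and chosen only for this example; the actual ObjectId layout is not described in this summary.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>

// Hypothetical 20-byte Mercurial node id.
using HgId20 = std::array<uint8_t, 20>;

// Hypothetical tag byte marking an id that carries an embedded HgId.
constexpr uint8_t kEmbeddedHgIdTag = 0x01;

// Variable-length object id stored as raw bytes (illustrative only).
class SketchObjectId {
 public:
  explicit SketchObjectId(std::string bytes) : bytes_(std::move(bytes)) {}

  // Build an id that embeds the HgId directly: [tag][20 bytes of hg id].
  static SketchObjectId embedHgId(const HgId20& hgId) {
    std::string bytes;
    bytes.reserve(1 + hgId.size());
    bytes.push_back(static_cast<char>(kEmbeddedHgIdTag));
    bytes.append(reinterpret_cast<const char*>(hgId.data()), hgId.size());
    assert(bytes.size() == 21);
    return SketchObjectId(std::move(bytes));
  }

  // Recover the HgId from the id itself, if one is embedded.
  // Returns std::nullopt for any other id (for example a 20-byte legacy id).
  std::optional<HgId20> embeddedHgId() const {
    if (bytes_.size() != 21 ||
        static_cast<uint8_t>(bytes_[0]) != kEmbeddedHgIdTag) {
      return std::nullopt;
    }
    HgId20 out{};
    std::copy(bytes_.begin() + 1, bytes_.end(), out.begin());
    return out;
  }

  const std::string& bytes() const { return bytes_; }

 private:
  std::string bytes_;
};
```

In this sketch, an id without the tag would still have to go through a separate lookup, which is the step the embedded form avoids.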

**Compatibility notes**

This diff is compatible with previous versions unless the store:embed-proxy-hashes config is set.

Once the config is set, the new ObjectId format is used and serialized into inodes. After that, previous versions of EdenFS won't be able to read overlay inodes created by this version.

This means we need to be careful when setting this config: once it is set, we won't be able to roll back the EdenFS version easily, since doing so would essentially require re-creating the Eden checkout.

Inodes created before this config is set remain in the old format; the new format is used only when a new inode is written.
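
The compatibility behaviour above can be sketched as follows. This is a simplified model, not the real overlay code: the sizes, the tag byte, and all function names are assumptions made for illustration, and `hash20` stands in for whichever 20-byte hash the writer has at hand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sizes: a legacy id is a bare 20-byte hash; the new format
// is assumed here to be one tag byte followed by 20 bytes.
constexpr size_t kLegacyIdSize = 20;
constexpr size_t kEmbeddedIdSize = 21;

// What an older reader could accept: only fixed 20-byte ids.
bool oldReaderAccepts(const std::string& serializedId) {
  return serializedId.size() == kLegacyIdSize;
}

// What a newer reader could accept: both the legacy and the new format.
bool newReaderAccepts(const std::string& serializedId) {
  return serializedId.size() == kLegacyIdSize ||
      serializedId.size() == kEmbeddedIdSize;
}

// Writer: the format of a newly written inode depends only on the config
// value at write time; ids already on disk are never rewritten.
std::string serializeIdForNewInode(
    const std::string& hash20,
    bool embedProxyHashes) {
  assert(hash20.size() == kLegacyIdSize);
  if (!embedProxyHashes) {
    return hash20;
  }
  return std::string(1, '\x01') + hash20;
}
```

Under these assumptions, an id written with the config enabled is rejected by `oldReaderAccepts`, which mirrors why rolling back after enabling the config is hard.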

**Git tree format issue**

We use the git tree serialization format in the LocalStore to serialize trees.
This format assumes 20-byte hashes and is not compatible with a variable-length ObjectId.

In this diff we bypass this issue by not storing trees in the local store. This seems OK in terms of correctness, because tree information can always be fetched from Mercurial.

However, this seems to impose a performance penalty on some workloads (see below).

We can solve this either by introducing a new format that supports variable-length object IDs (short term), or by getting rid of the tree cache and efficiently getting the data directly from Mercurial (long term).
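
One possible shape for the short-term option is a length-prefixed encoding, sketched below. This is not the format EdenFS adopted; the struct, the field order, and the little-endian 32-bit length prefixes are all assumptions for illustration. The point is only that each id carries its own length, so 20-byte and longer ids can share one tree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree entry whose object id may have any length.
struct SketchTreeEntry {
  std::string name;
  uint32_t mode;
  std::string objectId; // raw bytes
};

// Append a 32-bit value in little-endian byte order.
void appendU32(std::string& out, uint32_t v) {
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<char>((v >> (8 * i)) & 0xff));
  }
}

// Append a length-prefixed byte string.
void appendBytes(std::string& out, const std::string& bytes) {
  assert(bytes.size() <= UINT32_MAX);
  appendU32(out, static_cast<uint32_t>(bytes.size()));
  out.append(bytes);
}

// Read a little-endian 32-bit value; returns false on truncated input.
bool readU32(const std::string& in, size_t& pos, uint32_t& v) {
  if (in.size() - pos < 4) {
    return false;
  }
  v = 0;
  for (int i = 0; i < 4; ++i) {
    v |= static_cast<uint32_t>(static_cast<uint8_t>(in[pos + i])) << (8 * i);
  }
  pos += 4;
  return true;
}

// Read a length-prefixed byte string; returns false on truncated input.
bool readBytes(const std::string& in, size_t& pos, std::string& out) {
  uint32_t len = 0;
  if (!readU32(in, pos, len) || in.size() - pos < len) {
    return false;
  }
  out.assign(in, pos, len);
  pos += len;
  return true;
}

// Layout: [count] then, per entry, [mode][len name][len objectId].
std::string serializeTree(const std::vector<SketchTreeEntry>& entries) {
  std::string out;
  appendU32(out, static_cast<uint32_t>(entries.size()));
  for (const auto& e : entries) {
    appendU32(out, e.mode);
    appendBytes(out, e.name);
    appendBytes(out, e.objectId);
  }
  return out;
}

// Returns std::nullopt if the input is truncated or has trailing bytes.
std::optional<std::vector<SketchTreeEntry>> deserializeTree(
    const std::string& in) {
  size_t pos = 0;
  uint32_t count = 0;
  if (!readU32(in, pos, count)) {
    return std::nullopt;
  }
  std::vector<SketchTreeEntry> entries;
  for (uint32_t i = 0; i < count; ++i) {
    SketchTreeEntry e{};
    if (!readU32(in, pos, e.mode) || !readBytes(in, pos, e.name) ||
        !readBytes(in, pos, e.objectId)) {
      return std::nullopt;
    }
    entries.push_back(std::move(e));
  }
  if (pos != in.size()) {
    return std::nullopt;
  }
  return entries;
}
```

A format like this trades compatibility with existing git tree tooling for the ability to mix id lengths, which is why it is framed here only as a sketch of the short-term option.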

**Performance numbers**

Hot file access latency is reduced by about 31% (0.2331 ms to 0.1611 ms), and throughput is up by about 45%:
```
$ fsprobe.sh run cat.targets

Before:
lat: 0.2331 ms, qps: 4, dur: 28.697384178s, 123092 files, 217882490 bytes, 1641 errors, rate 7.59 Mb/s

After:
lat: 0.1611 ms, qps: 6, dur: 19.835917353s, 123092 files, 217882490 bytes, 1641 errors, rate 10.98 Mb/s
```

However, we do not see an improvement with arc focus, most likely due to bypassing tree serialization, so we will need to resolve that issue.

We can still merge this diff and see whether enabling this feature on other workloads, such as sandcastle, is beneficial.

Reviewed By: chadaustin

Differential Revision: D31777929

fbshipit-source-id: fc4b678477d0737c9f242968f0be99ed04f4f58a
2021-11-05 17:05:43 -07:00
..
git Remove ObjectId(hex) constructor [proxy hash removal 6/n] 2021-10-25 20:06:30 -07:00
test Make ObjectId a variable length hash [proxy hash removal 4/n] 2021-10-22 17:52:01 -07:00
Blob.h separate out ObjectId [proxy hash removal 1/n] 2021-10-01 10:25:46 -07:00
CMakeLists.txt model: add tests to CMake 2020-05-07 10:07:32 -07:00
Hash.cpp explicit Hash20 instead of Hash [proxy hash removal 2/n] 2021-10-01 12:43:26 -07:00
Hash.h explicit Hash20 instead of Hash [proxy hash removal 2/n] 2021-10-01 12:43:26 -07:00
ObjectId.cpp Make ObjectId a variable length hash [proxy hash removal 4/n] 2021-10-22 17:52:01 -07:00
ObjectId.h embed proxy hashes into object id [proxy hash removal 7/n] 2021-11-05 17:05:43 -07:00
RootId.cpp add missing headers 2021-09-27 17:01:18 -07:00
RootId.h introduce a variable-width RootId type that identifies the root of an EdenFS checkout's contents 2021-06-07 17:25:31 -07:00
Tree.cpp model: namespace facebook::eden 2021-06-08 19:29:37 -07:00
Tree.h embed proxy hashes into object id [proxy hash removal 7/n] 2021-11-05 17:05:43 -07:00
TreeEntry.cpp rename ObjectId::toString to toLogString 2021-10-19 18:58:52 -07:00
TreeEntry.h explicit Hash20 instead of Hash [proxy hash removal 2/n] 2021-10-01 12:43:26 -07:00