sapling/eden/fs/store/TreeMetadata.h
Andrey Chursin 0af2511a3f separate out ObjectId [proxy hash removal 1/n]
Summary:
The goal of this stack is to remove the proxy hash type, but to achieve that we first need to address some tech debt in the Eden codebase.

For a long time, EdenFS had a single Hash type that was used for many different use cases.

One of the major uses of the Hash type is identifying internal EdenFS objects such as blobs, trees, and others.

We seem to have reached agreement that those identifiers need their own type, so this diff introduces a separate ObjectId type for them and replaces _some_ uses of Hash with ObjectId.

We still retain the original Hash type for other use cases.

Roughly speaking, this is how this diff separates between Hash and ObjectId:

**ObjectId**:
* Everything that is stored in the local store (blobs, trees, commits)

**Hash20**:
* Explicit hashes (SHA-1 of the blob)
* Hg identifiers: manifest id and blob hg id

For now, ObjectId has exactly the same content as Hash, but this will change in future diffs. Doing it this way keeps the diff size manageable; migrating to the new ObjectId all at once would produce an extremely large diff that would be hard both to write and to review.
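
To illustrate the split described above, here is a minimal sketch of two distinct types that share the same 20-byte content. Everything in it is hypothetical: the `sketch` namespace, the struct layouts, and the `fromHash` helper are stand-ins for illustration, not the actual EdenFS classes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for illustration only; not the real EdenFS types.
namespace sketch {

constexpr std::size_t kRawSize = 20; // size of a SHA-1 digest in bytes

// An explicit 20-byte hash, e.g. the SHA-1 of a blob's contents.
struct Hash20 {
  std::array<std::uint8_t, kRawSize> bytes{};
};

// Identifier of an object kept in the local store. It has the same content
// as Hash20 for now, but it is a separate type so its layout can change later.
struct ObjectId {
  std::array<std::uint8_t, kRawSize> bytes{};

  // Conversions are spelled out, so every crossing between the two
  // concepts is visible at the call site.
  static ObjectId fromHash(const Hash20& hash) {
    return ObjectId{hash.bytes};
  }
};

// Same size today, but the compiler refuses to mix the two implicitly.
static_assert(sizeof(ObjectId) == sizeof(Hash20));
static_assert(!std::is_convertible_v<Hash20, ObjectId>);
static_assert(!std::is_convertible_v<ObjectId, Hash20>);

} // namespace sketch
```

Keeping the types distinct while their contents are still identical means later changes to the identifier layout only touch code that already names `ObjectId`.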

There are a few more things that need to be done before we can get to the meat of removing proxy hashes:

1) Replace the Hash.h include with ObjectId.h where needed
2) Remove the Hash type and explicitly rename the remaining Hash usages to Hash20
3) Modify the content of ObjectId to support new use cases
4) Modify serialized metadata and possibly other places that assume the ObjectId size is fixed and equal to the Hash20 size

Reviewed By: chadaustin

Differential Revision: D31316477

fbshipit-source-id: 0d5e4460a461bcaac6b9fd884517e129aeaf4baf
2021-10-01 10:25:46 -07:00


/*
 * Copyright (c) Facebook, Inc. and its affiliates.
 *
 * This software may be used and distributed according to the terms of the
 * GNU General Public License version 2.
 */

#pragma once

#include <optional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

#include <folly/io/IOBuf.h>

#include "eden/fs/model/Hash.h"
#include "eden/fs/store/SerializedBlobMetadata.h"

namespace facebook::eden {

class BlobMetadata;
class StoreResult;

/**
 * This is to help manipulate and store the metadata for the blob entries
 * in a tree. Currently "metadata" means the size and the SHA-1 hash of a
 * Blob's contents.
 */
class TreeMetadata {
 public:
  /**
   * Used to prepare the tree metadata for storage and when tree metadata is
   * read out of the local store.
   *
   * Storing tree metadata indexed by hashes instead of names removes the
   * complexity of storing variable-length names. It also allows us to easily
   * store BlobMetadata from stored TreeMetadata, since BlobMetadata is stored
   * under the eden hash for a blob.
   */
  using HashIndexedEntryMetadata =
      std::vector<std::pair<ObjectId, BlobMetadata>>;

  /**
   * Used when TreeMetadata was just fetched from the server.
   *
   * The server is unaware of the eden-specific hashes we use in eden, so
   * tree metadata from the server uses names to index the metadata for the
   * entries in the tree.
   */
  using NameIndexedEntryMetadata =
      std::vector<std::pair<std::string, BlobMetadata>>;

  using EntryMetadata =
      std::variant<HashIndexedEntryMetadata, NameIndexedEntryMetadata>;

  explicit TreeMetadata(EntryMetadata entryMetadata);

  /**
   * Serializes the metadata for all of the blob entries in the tree.
   *
   * Note: the hash of each entry is used in serialization, so the
   * EntryIdentifier of every entry must contain the entry's hash before
   * this method is called. Otherwise this throws a std::domain_error.
   */
  folly::IOBuf serialize() const;

  static TreeMetadata deserialize(const StoreResult& result);

  const EntryMetadata& entries() const {
    return entryMetadata_;
  }

 private:
  size_t getSerializedSize() const;

  size_t getNumberOfEntries() const;

  static constexpr size_t ENTRY_SIZE =
      ObjectId::RAW_SIZE + SerializedBlobMetadata::SIZE;

  EntryMetadata entryMetadata_;
};

} // namespace facebook::eden
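
As a rough illustration of how a fixed per-entry size and the two-alternative `EntryMetadata` variant could interact, the sketch below computes a serialized payload size for hash-indexed entries and rejects name-indexed ones. It is not the EdenFS implementation. The `sketch` namespace, the struct layouts, the 28-byte metadata size (an 8-byte size plus a 20-byte SHA-1), and the `serializedSize` helper are all assumptions made for illustration, and any header the real format may write is ignored.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real EdenFS types.
namespace sketch {

// Assumed sizes: a 20-byte id, and metadata made of an 8-byte size
// followed by a 20-byte SHA-1.
constexpr std::size_t kObjectIdRawSize = 20;
constexpr std::size_t kSerializedBlobMetadataSize = sizeof(std::uint64_t) + 20;
constexpr std::size_t kEntrySize =
    kObjectIdRawSize + kSerializedBlobMetadataSize;

struct ObjectId {
  std::array<std::uint8_t, kObjectIdRawSize> bytes{};
};

struct BlobMetadata {
  std::array<std::uint8_t, 20> sha1{};
  std::uint64_t size = 0;
};

using HashIndexed = std::vector<std::pair<ObjectId, BlobMetadata>>;
using NameIndexed = std::vector<std::pair<std::string, BlobMetadata>>;
using EntryMetadata = std::variant<HashIndexed, NameIndexed>;

// Bytes needed for the entries alone. Every entry has the same size because
// it is keyed by a fixed-size id rather than a variable-length name.
inline std::size_t serializedSize(const EntryMetadata& entries) {
  if (const auto* hashed = std::get_if<HashIndexed>(&entries)) {
    return hashed->size() * kEntrySize;
  }
  // Name-indexed entries have no fixed-size key, so they are rejected.
  throw std::domain_error("entries are indexed by name, cannot serialize");
}

} // namespace sketch
```

Under these assumptions each entry occupies 48 bytes, so a deserializer could recover the entry count from the payload length alone. That simplification depends on the identifier size staying fixed.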