Summary: This will help in abstracting the kind of store that is being written to.
Reviewed By: kulshrax
Differential Revision: D15284358
fbshipit-source-id: ab6a6d23978480ca65587b745ae39ac6ed98cca9
Summary:
This will allow the edenapi code to transparently use either the
IndexedLogDataStore or a datapack.
Reviewed By: kulshrax
Differential Revision: D15266194
fbshipit-source-id: 6396118a5c8107a8c91e5fc83fe4297d4321d10c
Summary:
This will be used to abstract writing to a MutableDataPack or
IndexedLogDataStore (or both).
Reviewed By: kulshrax
Differential Revision: D15266193
fbshipit-source-id: 99f2383555addbafea81a2752e8d6759a1c1c5e7
Summary:
When remotefilelog.fetchpacks is enabled, it's possible that 100 packfiles of
100MB each are present. In this case, every new packfile that
hg_memcache_client writes would force an incremental repack, which would only
reduce the number of packfiles by a small amount.
Let's add a simple heuristic that tries to bring the number of packfiles
below 50.
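A minimal sketch of what such a heuristic could look like (hypothetical names and selection logic; only the 50-packfile threshold comes from this diff):

```rust
// Hypothetical sketch: pick enough of the smallest packfiles so that,
// once they are repacked into a single new pack, fewer than
// MAX_PACKFILES remain on disk.
const MAX_PACKFILES: usize = 50;

fn packs_to_repack(mut sizes: Vec<u64>) -> Vec<u64> {
    if sizes.len() <= MAX_PACKFILES {
        return Vec::new();
    }
    // Repacking the smallest files first keeps the incremental cost low.
    sizes.sort_unstable();
    // Repacking n files produces 1 new file, so consume enough files
    // that len - n + 1 < MAX_PACKFILES.
    let n = sizes.len() - MAX_PACKFILES + 2;
    sizes.truncate(n);
    sizes
}

fn main() {
    let sizes: Vec<u64> = (1..=100).collect();
    let selected = packs_to_repack(sizes);
    // Repacking these 52 files leaves 100 - 52 + 1 = 49 packfiles.
    assert_eq!(selected.len(), 52);
}
```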
Reviewed By: DurhamG
Differential Revision: D15203771
fbshipit-source-id: 18c39487d5ac087d4879004993c1c1add087249c
Summary:
Instead of manually dropping some of the datapack/historypack fields, we can
drop the entire object. This makes implementing the Drop trait easier, but it
prevents the code from later moving some of the object's fields out. We can
use std::mem::replace to move them out in a zero-copy fashion.
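A reduced illustration of the pattern (hypothetical struct and field names, not the real DataPack):

```rust
use std::mem;

// Types that implement Drop forbid moving fields out of `self`;
// mem::replace swaps the field with a cheap placeholder, letting us take
// ownership of the buffer without copying it and without fighting Drop.
struct DataPack {
    index: Vec<u8>,
}

impl Drop for DataPack {
    fn drop(&mut self) {
        // e.g. close file handles, delete temporary files, ...
    }
}

impl DataPack {
    fn take_index(&mut self) -> Vec<u8> {
        mem::replace(&mut self.index, Vec::new())
    }
}

fn main() {
    let mut pack = DataPack { index: vec![1, 2, 3] };
    let index = pack.take_index();
    assert_eq!(index, vec![1, 2, 3]);
    assert!(pack.index.is_empty());
}
```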
Reviewed By: DurhamG
Differential Revision: D15076017
fbshipit-source-id: 4831dfcc2005c957862d32eeda02f62796be3afb
Summary: Reading a comment is easier than trying to figure out the on-disk format.
Reviewed By: kulshrax
Differential Revision: D15056859
fbshipit-source-id: 097ed8bcaa51369aba4bcc9ed1cc95ebd6a67a66
Summary:
Compressing/decompressing data can be expensive, so avoid doing it when not
needed. I thought about using a RefCell but decided on just using a mutable
reference, as an Entry will always be private to indexedlogdatastore.rs.
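A sketch of the caching shape this describes (hypothetical field names, and the real decompression is stubbed out):

```rust
// Hypothetical sketch: an Entry caches its decompressed content behind a
// plain &mut self instead of a RefCell, since entries stay private to a
// single module and are never shared.
struct Entry {
    compressed: Vec<u8>,
    content: Option<Vec<u8>>,
}

impl Entry {
    fn content(&mut self) -> &[u8] {
        if self.content.is_none() {
            // Stand-in for the real decompression (e.g. lz4).
            let decompressed = self.compressed.clone();
            self.content = Some(decompressed);
        }
        self.content.as_deref().unwrap()
    }
}

fn main() {
    let mut e = Entry { compressed: vec![7, 8, 9], content: None };
    assert!(e.content.is_none()); // nothing decompressed yet
    assert_eq!(e.content(), &[7, 8, 9][..]);
    assert!(e.content.is_some()); // cached after first access
}
```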
Reviewed By: kulshrax
Differential Revision: D15056862
fbshipit-source-id: ac0b811f2df563be86e3ade9abe89476db5d13cc
Summary: This will allow decompression to be done on the fly rather than eagerly.
Reviewed By: kulshrax
Differential Revision: D15056860
fbshipit-source-id: 60635c431579fc924a61d08b35688222ec4930bb
Summary:
Delta chains are only created during repack, as every download operation
fetches the full content of the file. Even if we wanted to support them,
interrupted chains add undesirable complexity, as they can lead to chain loops
if we're not careful. Let's just not support delta chains for now to avoid this.
Reviewed By: kulshrax
Differential Revision: D15056861
fbshipit-source-id: 4b0474ce134e946952a70f363190faf50850abe0
Summary:
I want to give Store a more specific name so that it doesn't get
confused with other Store abstractions that we will add in the
future.
Reviewed By: singhsrb
Differential Revision: D15007383
fbshipit-source-id: 499bcda4aecd5389e3bc1eba5206ba72a69c4c3d
Summary:
In the case where a delta chain is split between several logs, it's possible
that part of it may be removed due to some logs being removed. Instead of
treating this as an error, we can simply return the partial chain; the union
content store will then continue the delta chain on the next store.
Reviewed By: quark-zju
Differential Revision: D14899943
fbshipit-source-id: 7369ee191dc4b35873344cd13c295c72472e0712
Summary:
The Python code interprets a KeyError as a lookup failure, and will retry the
lookup on the next store. Any other Rust errors will be translated into a
RuntimeError exception that Python will re-raise and stop the lookup.
Reviewed By: quark-zju
Differential Revision: D14895905
fbshipit-source-id: d22733c0a68ff3f28d502eb2cd4c3a0467ee35d1
Summary:
We should update the builder for Key to take a repo path. We could build
the key directly using the default struct constructor, but representing
the two constructors as functions is clearer.
Reviewed By: quark-zju
Differential Revision: D14877543
fbshipit-source-id: 328906521cdbad535e28df22fea82f21e8b5410a
Summary:
It is fairly difficult to avoid an intermediary state without introducing some
panics. Since we don't really deal with invalid paths, this intermediary state
is not a real concern.
Reviewed By: quark-zju
Differential Revision: D14877553
fbshipit-source-id: 6f60f20af8d8f1e3ff23c5d8ab5353bc8d919ebf
Summary:
This function is difficult to justify in the context of the Rust borrow checker.
The primary concern for this pattern is preventing mutation when the object is
passed around.
We can always add the function back if it has to do more than just return the
underlying value.
Reviewed By: quark-zju
Differential Revision: D14877545
fbshipit-source-id: acdd796e1bee5445c1bce5ce0ceb41a7334e4966
Summary:
While the Rust code can read/write content out of an indexedlog, the Python
code cannot. For now, all the writes will be done in Rust, and the Python code
will only be able to read from it.
Reviewed By: quark-zju
Differential Revision: D14894330
fbshipit-source-id: 5c1698d31412bc93e93dabb93be106a2ef17d184
Summary:
Packfiles have proven hard to make perform well in several situations. For
instance, repacks are required to keep common operations from spending most
of their time scanning and iterating over the filesystem. In fact, most of
the pain points with packfiles are caused by their immutability: once written,
they can no longer be updated.
IndexedLog, on the other hand, can be updated in place, and therefore does not
require repacks and does not exhibit some of the pathological behavior that
packfiles show.
As a first step, let's add a simple content store backed by indexedlog.
Reviewed By: quark-zju
Differential Revision: D14790070
fbshipit-source-id: 44f766db6a08169971f87a38246873c6e53c3233
Summary:
Zeyi realized that empty packfiles were problematic for the cdatapack code.
While that code should be fixed, having empty packfiles lying around is
unnecessary anyway, so let's not write them.
Reviewed By: fanzeyi
Differential Revision: D14760942
fbshipit-source-id: a128eedaf79a6388a3c7142399715bb4eb96a2ae
Summary:
While datapack repack is fairly inexpensive in memory and mostly limited by
the number of entries in its index, a historypack repack needs to keep both
the data and the index in memory. It appears that the overhead of doing so is
a big factor in repack's memory usage: a resulting 100MB histpack would use
about 1.2GB of RAM. Extrapolating the numbers, a resulting 4GB histpack would
need 48GB, which is enough to put a devserver in a swapping state, and worse
for laptops. Limiting the historypack size to 400MB should cap the RAM usage
to a bit under 5GB.
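In sketch form, the cap amounts to splitting a repack into batches once the resulting pack would exceed 400MB (hypothetical names and batching logic; only the 400MB figure comes from this diff):

```rust
// Hypothetical sketch of the size cap: stop adding packs to the current
// repack batch once the resulting historypack would exceed 400MB.
const MAX_HISTPACK_SIZE: u64 = 400 * 1024 * 1024;

fn split_into_batches(sizes: &[u64]) -> Vec<Vec<u64>> {
    let mut batches = Vec::new();
    let mut current = Vec::new();
    let mut total = 0u64;
    for &size in sizes {
        if total + size > MAX_HISTPACK_SIZE && !current.is_empty() {
            batches.push(std::mem::take(&mut current));
            total = 0;
        }
        current.push(size);
        total += size;
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}

fn main() {
    let mb: u64 = 1024 * 1024;
    let sizes = vec![300 * mb, 300 * mb, 50 * mb];
    let batches = split_into_batches(&sizes);
    // 300MB + 300MB would exceed the cap, so the repack is split in two.
    assert_eq!(batches.len(), 2);
    assert_eq!(batches[1], vec![300 * mb, 50 * mb]);
}
```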
Reviewed By: kulshrax
Differential Revision: D14757839
fbshipit-source-id: b08bf01bddad01f1cae9cc67d4bd3d637c0bf0db
Summary: Add a function that adds all of the entries from a given iterator into a HistoryPack.
Differential Revision: D14647767
fbshipit-source-id: 29a71b37da86125e14135c40c279bfc8a454b568
Summary:
Now that mutablepacks can only create v1 packfiles, we can force the Metadata
to not be optional. The main reason for doing this is to avoid issues where
LFS data is stored without its corresponding LFS flag. This can cause issues
down the line, as LFS data would be interpreted as-is instead of being
interpreted as a pointer to the LFS blob.
Reviewed By: sfilipco
Differential Revision: D14443509
fbshipit-source-id: 9e7812017fc1356072278496406648f935024f92
Summary:
The v0 format doesn't support flags, such as the one indicating that the data
is actually an LFS pointer. Let's simply forbid creating v0 packfiles.
Reviewed By: quark-zju
Differential Revision: D14443512
fbshipit-source-id: 6ffa2e8fda2b2baba0aae53e749bc9248594a134
Summary:
These last 2 errors are still considered fatal, but shouldn't be, as they are
most likely transient. Failing to open a packfile that was successfully opened
before can happen, for instance, when the file is removed by another process,
or if it somehow becomes corrupted. Failing to remove the packfile should no
longer be an issue, but if it does fail, we can also ignore it with the
reasoning that the next repack will take care of it.
Reviewed By: sfilipco
Differential Revision: D14441288
fbshipit-source-id: 6c2758c2a88fd5d2d83b55defe3d263ee9f974a1
Summary: `LooseHistoryEntry` and `PackHistoryEntry` aren't the best names for these types, since the latter is what most users should use, whereas the former should typically only be used for data transmission. As such, we should rename these to clarify the intent.
Differential Revision: D14512749
fbshipit-source-id: 5293df89766825077b2ba07224297b958bf46002
Summary:
Corrupted packfiles, or their removal by a background process, could cause
repack to fail; let's simply ignore these transient errors and continue
repacking.
Reviewed By: DurhamG
Differential Revision: D14373901
fbshipit-source-id: afe88e89a3bd0d010459975abecb2fef7f8dff6f
Summary: The historypack wasn't using remove_file from vfs, which was causing repack to fail.
Reviewed By: sfilipco
Differential Revision: D14373649
fbshipit-source-id: 2d87f24bda541bc011ed38533db1ac7bdddc81e3
Summary: `AsRef<Path>` is more ergonomic than `&Path` since the former can accept `PathBuf`, `String`, etc.
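A small illustration of the ergonomics (hypothetical function, not from the actual diff):

```rust
use std::path::{Path, PathBuf};

// Taking `impl AsRef<Path>` lets callers pass &str, String, &Path, or
// PathBuf without converting first, unlike a `&Path` parameter.
fn pack_path(dir: impl AsRef<Path>) -> PathBuf {
    dir.as_ref().join("packs")
}

fn main() {
    // All of these compile against the same signature.
    assert_eq!(pack_path("/tmp"), PathBuf::from("/tmp/packs"));
    assert_eq!(pack_path(String::from("/tmp")), PathBuf::from("/tmp/packs"));
    assert_eq!(pack_path(Path::new("/tmp")), PathBuf::from("/tmp/packs"));
}
```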
Differential Revision: D14223167
fbshipit-source-id: 12d26adaa63855c339e04734c19d6697624f9c9e
Summary: Changing the permissions on the packfile failed due to the file being open.
Reviewed By: quark-zju
Differential Revision: D14174652
fbshipit-source-id: 356ac4748fd69e660a6cb9e63367a87489755e5e
Summary:
Rust tells us that Rng::choose and Rng::shuffle should be replaced by
SliceRandom::choose and SliceRandom::shuffle, so let's do it.
Reviewed By: singhsrb
Differential Revision: D14178565
fbshipit-source-id: 586eb2891f1c2cab0a3435c1b4ae8f870e7a3c25
Summary: Add a convenience function to `MutableHistoryPack` to add an entry from a `PackHistoryEntry` struct.
Differential Revision: D14162781
fbshipit-source-id: a0e07f34b9231011a339ce63adcef8ab55a0555e
Summary:
With the Store trait, we can de-duplicate code between the datapack repack and
the historypack repack.
Reviewed By: quark-zju
Differential Revision: D14091894
fbshipit-source-id: 5bf335414df2420b42ec45cce7097f3a97a49796
Summary:
A lot of code is duplicated between data stores and history stores, and one
reason for this is the absence of a common trait between the two. Adding a new
Store trait will make it easier to write generic code that works across data
and history stores.
Reviewed By: quark-zju
Differential Revision: D14091899
fbshipit-source-id: deef1d43a7d300cb3607c67554ad54f20c870e23
Summary:
Instead of manually implementing DataStore/HistoryStore for Box, Rc, Arc, and
future smart pointers, we can simply implement the trait for all types that
deref into a DataStore/HistoryStore.
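The shape of the change, on a hypothetical reduced trait standing in for DataStore/HistoryStore:

```rust
use std::ops::Deref;
use std::rc::Rc;
use std::sync::Arc;

trait DataStore {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

// One blanket impl covers Box, Rc, Arc, plain references, and any future
// smart pointer, instead of a hand-written impl per wrapper type.
impl<T: DataStore + ?Sized, U: Deref<Target = T>> DataStore for U {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        T::get(self, key)
    }
}

// A concrete store; it doesn't implement Deref, so it doesn't overlap
// with the blanket impl above.
struct MemStore;

impl DataStore for MemStore {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        Some(key.as_bytes().to_vec())
    }
}

fn main() {
    let boxed: Box<MemStore> = Box::new(MemStore);
    let rc: Rc<MemStore> = Rc::new(MemStore);
    let arc: Arc<MemStore> = Arc::new(MemStore);
    assert_eq!(boxed.get("a"), Some(b"a".to_vec()));
    assert_eq!(rc.get("b"), Some(b"b".to_vec()));
    assert_eq!(arc.get("c"), Some(b"c".to_vec()));
}
```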
Reviewed By: quark-zju
Differential Revision: D14078072
fbshipit-source-id: 47a80ab0179b84aa08836b6e7c5c3c5f9c1a08ff
Summary:
In order to move the types in `edenapi-types` (containing types shared between Mercurial and Mononoke) to the `types` crate, we first need to move a few types from the `revisionstore` crate into `types`: since `revisionstore` depends on `types` and `edenapi-types` uses types from `revisionstore`, leaving them in place would create a circular dependency.
In particular, this diff moves the `Key` and `NodeInfo` types into their own modules in the `types` crate.
Reviewed By: quark-zju
Differential Revision: D14114166
fbshipit-source-id: 8f9e78d610425faec9dc89ecc9e450651d24177a
Summary:
During repack, the repacked files are deleted without any verification. Since
Adam saw some data loss, it's possible that repack somehow didn't fully repack
a packfile but deleted it anyway. Let's verify that the entire packfile was
repacked before deleting it.
Since repack is mostly a background operation, we don't have a way to notify
the user, but we can log the error to a scuba table to analyse further.
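The safety check boils down to a subset test over the keys involved (hypothetical sketch; the real code walks the packfile entries):

```rust
use std::collections::HashSet;

// Only delete the old packfile once every key it contained is present in
// the newly written pack; otherwise keep it and report the error.
fn safe_to_delete(old_keys: &HashSet<String>, new_keys: &HashSet<String>) -> bool {
    old_keys.is_subset(new_keys)
}

fn main() {
    let old: HashSet<String> = ["a", "b"].iter().map(|s| s.to_string()).collect();
    let mut new: HashSet<String> = ["a"].iter().map(|s| s.to_string()).collect();
    assert!(!safe_to_delete(&old, &new)); // "b" is missing: keep the packfile
    new.insert("b".to_string());
    assert!(safe_to_delete(&old, &new)); // fully repacked: safe to delete
}
```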
Reviewed By: DurhamG
Differential Revision: D14069766
fbshipit-source-id: 4358a87deeb9732eec1afdfb742e8d81db41cd87
Summary:
Removing files on Windows is hard. It can fail for many reasons, many of which
involve another process having the file open in some way. One way to work
around this is to rename the file instead, since renaming isn't as restrictive
as removing. And since hg repack attempts to remove any temporary files, it
will also try to remove the packfiles that we previously failed to remove.
Reviewed By: DurhamG
Differential Revision: D14030445
fbshipit-source-id: 1f3799e021c2e0451943a1d5bd4cd25ed608ffb6
Summary:
Packfiles are named based on their content, so an on-disk file with the same
name means it has the same content. If that happens, let's simply continue
without failing.
Reviewed By: DurhamG
Differential Revision: D14030446
fbshipit-source-id: f04c15507c89b2fca19c95a7b41d8e65c88da019