Commit Graph

189 Commits

Author SHA1 Message Date
Xavier Deguillard
2ddabd35f9 revisionstore: add MutableHistoryStore
Summary: This will help in abstracting the kind of store that is being written to.

Reviewed By: kulshrax

Differential Revision: D15284358

fbshipit-source-id: ab6a6d23978480ca65587b745ae39ac6ed98cca9
2019-05-09 18:33:49 -07:00
Xavier Deguillard
96c66954e5 revisionstore: implement MutableDeltaStore for MutableDataPack and IndexedLogDataStore
Summary:
This will allow the edenapi code to transparently use either the
IndexedLogDataStore or a datapack.

Reviewed By: kulshrax

Differential Revision: D15266194

fbshipit-source-id: 6396118a5c8107a8c91e5fc83fe4297d4321d10c
2019-05-09 18:33:48 -07:00
Xavier Deguillard
053e854327 revisionstore: add MutableDeltaStore
Summary:
This will be used to abstract writing to a MutableDataPack or
IndexedLogDataStore (or both).

Reviewed By: kulshrax

Differential Revision: D15266193

fbshipit-source-id: 99f2383555addbafea81a2752e8d6759a1c1c5e7
2019-05-09 18:33:48 -07:00
Xavier Deguillard
f94c1f0c69 revisionstore: repack packfiles to get under 50 packfiles.
Summary:
When remotefilelog.fetchpacks is enabled, it's possible that 100 packfiles of
100MB each are present. In this case, every new packfile that
hg_memcache_client writes will force an incremental repack, which will
only reduce the number of packfiles by a small number.

Let's have a simple heuristic that tries to bring the number of packfiles to be
lower than 50.

Reviewed By: DurhamG

Differential Revision: D15203771

fbshipit-source-id: 18c39487d5ac087d4879004993c1c1add087249c
2019-05-06 12:07:31 -07:00
Xavier Deguillard
2f58502ba8 revisionstore: use replace instead of direct drop
Summary:
Instead of manually dropping some of the datapack/historypack fields, we can
drop the entire object. This allows implementing the Drop trait more easily.
But, this prevents the code from later using some of the object fields. We can
use replace to move them in a zero-copy fashion.

Reviewed By: DurhamG

Differential Revision: D15076017

fbshipit-source-id: 4831dfcc2005c957862d32eeda02f62796be3afb
2019-04-25 18:53:06 -07:00
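The move-out-of-a-`Drop`-type pattern described in this commit can be sketched as follows. This is a minimal illustration, not the actual revisionstore code: `MutablePack`, `buffer`, and `into_buffer` are hypothetical names.

```rust
use std::mem;

/// Hypothetical stand-in for a pack type that implements `Drop`.
struct MutablePack {
    buffer: Vec<u8>,
}

impl Drop for MutablePack {
    fn drop(&mut self) {
        // Cleanup would go here (for example, removing a temporary file).
    }
}

impl MutablePack {
    /// Consumes the pack and hands back its buffer without copying the bytes.
    fn into_buffer(mut self) -> Vec<u8> {
        // `self.buffer` cannot be moved out directly because `MutablePack`
        // implements `Drop` (compiler error E0509). `mem::replace` swaps in
        // an empty Vec and returns the original one, so only the Vec header
        // is moved. `self` is dropped when this function returns.
        mem::replace(&mut self.buffer, Vec::new())
    }
}
```

Calling `into_buffer` on a pack built from `vec![1, 2, 3]` hands back the same three bytes, and the `Drop` implementation then runs on the emptied pack.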
Xavier Deguillard
621d5f637c revisionstore: describe the serialization format
Summary: Reading a comment is easier than trying to figure out the on-disk format.

Reviewed By: kulshrax

Differential Revision: D15056859

fbshipit-source-id: 097ed8bcaa51369aba4bcc9ed1cc95ebd6a67a66
2019-04-24 10:58:55 -07:00
Xavier Deguillard
ffba172165 revisionstore: compress/decompress when needed
Summary:
Compressing/Decompressing data can be expensive, so avoid doing it when not
needed. I thought about using a RefCell but decided on just using a mutable
reference, as an Entry will always be private to indexedlogdatastore.rs.

Reviewed By: kulshrax

Differential Revision: D15056862

fbshipit-source-id: ac0b811f2df563be86e3ade9abe89476db5d13cc
2019-04-24 10:58:55 -07:00
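A rough sketch of decompressing only on first access, using a mutable reference rather than a `RefCell`. Everything here is assumed for illustration: the `Entry` layout, the `decompress_calls` counter, and `fake_decompress` (a stand-in that merely reverses bytes) are not taken from indexedlogdatastore.rs.

```rust
/// Stand-in codec for illustration only: a real store would call an actual
/// compression library here. This one just reverses the bytes.
fn fake_decompress(data: &[u8]) -> Vec<u8> {
    data.iter().rev().copied().collect()
}

/// Hypothetical entry that keeps the on-disk bytes and decompresses lazily.
struct Entry {
    compressed: Vec<u8>,
    content: Option<Vec<u8>>,
    /// Counts how often the codec ran, to make the laziness observable.
    decompress_calls: usize,
}

impl Entry {
    fn new(compressed: Vec<u8>) -> Self {
        Entry { compressed, content: None, decompress_calls: 0 }
    }

    /// Takes `&mut self` so the decompressed bytes can be cached in place,
    /// without interior mutability.
    fn content(&mut self) -> &[u8] {
        if self.content.is_none() {
            self.decompress_calls += 1;
            self.content = Some(fake_decompress(&self.compressed));
        }
        self.content.as_deref().unwrap()
    }
}
```

Reading `content()` twice runs the codec once; an entry that is never read is never decompressed.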
Xavier Deguillard
5b2bdfb23d revisionstore: make indexedlogdatastore::Entry fields private
Summary: This will allow decompression to be done on the fly as opposed to always.

Reviewed By: kulshrax

Differential Revision: D15056860

fbshipit-source-id: 60635c431579fc924a61d08b35688222ec4930bb
2019-04-24 10:58:55 -07:00
Xavier Deguillard
dc612855be revisionstore: don't support delta chains in the indexedlog datastore
Summary:
Delta chains are only created during repack, as every download operation
fetches the full content of the file. Even if we wanted to support them,
interrupted chains add undesirable complexity, as they can lead to chain loops
if we're not careful. Let's just not support delta chains for now to avoid this.

Reviewed By: kulshrax

Differential Revision: D15056861

fbshipit-source-id: 4b0474ce134e946952a70f363190faf50850abe0
2019-04-24 10:58:55 -07:00
Stefan Filip
a802e610d1 revisionstore: rename Store to LocalStore
Summary:
I want to give Store a more specific name so that it doesn't get
confused with other Store abstractions that we will add in the
future.

Reviewed By: singhsrb

Differential Revision: D15007383

fbshipit-source-id: 499bcda4aecd5389e3bc1eba5206ba72a69c4c3d
2019-04-19 09:51:29 -07:00
Stefan Filip
4b0e94305f revisionstore: use RepoPath in DataEntry
Summary: migration

Differential Revision: D14945337

fbshipit-source-id: 96247d27bc9e829a1ebb73c5617a399e149ac69b
2019-04-16 15:34:30 -07:00
Xavier Deguillard
67a4b6af52 revisionstore: partial chains can be returned from get_delta_chain
Summary:
In the case where a delta chain is split between several logs, it's possible
that part of it may be removed due to some logs being removed. Instead of
treating this as an error, we can simply return the partial chain; the union
content store will then continue the delta chain on the next store.

Reviewed By: quark-zju

Differential Revision: D14899943

fbshipit-source-id: 7369ee191dc4b35873344cd13c295c72472e0712
2019-04-16 10:47:09 -07:00
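A simplified sketch of how a union lookup might resume a partial chain on the next store. The `Delta`, `ChainStore`, and `union_delta_chain` names are hypothetical, keys are plain integers, and the sketch assumes the stores contain no chain loops and are consulted once, in order.

```rust
use std::collections::HashMap;

/// Hypothetical simplified delta: `base` is `None` for a full text.
#[derive(Clone, Debug, PartialEq)]
struct Delta {
    key: u32,
    base: Option<u32>,
}

/// Hypothetical store holding some of the deltas.
struct ChainStore {
    deltas: HashMap<u32, Delta>,
}

impl ChainStore {
    /// Follows base pointers as far as this store can. The returned chain may
    /// be partial (its last delta still has a base) rather than an error.
    fn get_delta_chain(&self, key: u32) -> Vec<Delta> {
        let mut chain = Vec::new();
        let mut next = Some(key);
        while let Some(k) = next {
            match self.deltas.get(&k) {
                Some(delta) => {
                    next = delta.base;
                    chain.push(delta.clone());
                }
                None => break,
            }
        }
        chain
    }
}

/// Union lookup: resume a partial chain on the following stores.
fn union_delta_chain(stores: &[ChainStore], key: u32) -> Option<Vec<Delta>> {
    let mut chain: Vec<Delta> = Vec::new();
    let mut wanted = key;
    for store in stores {
        chain.extend(store.get_delta_chain(wanted));
        if let Some(last) = chain.last() {
            match last.base {
                Some(base) => wanted = base, // still partial, keep going
                None => return Some(chain),  // reached a full text
            }
        }
    }
    None // no store could complete the chain
}
```

With delta 3 in one store and deltas 2 and 1 in the next, the union lookup yields the chain 3, 2, 1, while the first store alone returns only the partial chain.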
Xavier Deguillard
c358df34a8 revisionstore: a not found key should fail with KeyError
Summary:
The Python code interprets a KeyError as a lookup failure, and will retry the
lookup on the next store. Any other Rust errors will be translated into a
RuntimeError exception that Python will re-raise and stop the lookup.

Reviewed By: quark-zju

Differential Revision: D14895905

fbshipit-source-id: d22733c0a68ff3f28d502eb2cd4c3a0467ee35d1
2019-04-16 10:47:09 -07:00
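The retry-on-KeyError behavior described above lives in Python, but the control flow can be modeled in Rust. This is only a model under assumed names: `StoreError`, `MapStore`, and `lookup` are hypothetical and do not mirror the real bindings.

```rust
use std::collections::HashMap;

/// Hypothetical error type: only `KeyError` means "not found here".
#[derive(Debug, PartialEq)]
enum StoreError {
    KeyError(String),
    Other(String),
}

/// Hypothetical in-memory store; `broken` simulates a non-lookup failure.
struct MapStore {
    data: HashMap<String, Vec<u8>>,
    broken: bool,
}

impl MapStore {
    fn get(&self, key: &str) -> Result<Vec<u8>, StoreError> {
        if self.broken {
            return Err(StoreError::Other("corrupted store".to_string()));
        }
        self.data
            .get(key)
            .cloned()
            .ok_or_else(|| StoreError::KeyError(key.to_string()))
    }
}

/// Tries each store in order: a `KeyError` moves on to the next store,
/// any other error stops the lookup immediately.
fn lookup(stores: &[MapStore], key: &str) -> Result<Vec<u8>, StoreError> {
    for store in stores {
        match store.get(key) {
            Ok(data) => return Ok(data),
            Err(StoreError::KeyError(_)) => continue,
            Err(other) => return Err(other),
        }
    }
    Err(StoreError::KeyError(key.to_string()))
}
```

If a missing key were reported as a generic error instead, the loop would stop at the first store that lacks the key, which is the failure mode this commit avoids.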
Stefan Filip
d45e21573a revisionstore: use RepoPath in HistoryPackIterator
Summary: migration

Reviewed By: quark-zju

Differential Revision: D14908310

fbshipit-source-id: 76623300c04bd8643796a99f66d9d3144787f072
2019-04-15 10:01:52 -07:00
Stefan Filip
be522e8dc5 revisionstore: remove uses Key::from_name_slice
Summary: migration

Reviewed By: quark-zju

Differential Revision: D14908315

fbshipit-source-id: 5d7d11982b70d10b49bb7fcd12cc6bf9c98146d6
2019-04-15 10:01:52 -07:00
Stefan Filip
77cdaca742 types: remove uses of Key::from_name_slice
Summary: migrating

Reviewed By: quark-zju

Differential Revision: D14908314

fbshipit-source-id: 92d9092bd879858349ab3b8cb98a484451c0442b
2019-04-15 10:01:52 -07:00
Stefan Filip
ee7703e821 revisionstore: use RepoPath in HistoryEntry
Summary: migrating

Reviewed By: quark-zju

Differential Revision: D14884957

fbshipit-source-id: 865f970627c08a26d1336fa57235f8ebbdb1d4a9
2019-04-15 10:01:51 -07:00
Stefan Filip
dd4a010aac revisionstore: use RepoPath in HistoryPack
Summary: Migrating

Reviewed By: quark-zju

Differential Revision: D14884958

fbshipit-source-id: 34bf2ea726b19f9929652d9836a224baac8b328b
2019-04-15 10:01:51 -07:00
Stefan Filip
7e2b3c256f types: rename Key::new to Key::from_name_slice
Summary:
We should update the builder for Key to take a repo path. We could build
the key directly using the default struct constructor, but representing
the two constructors as functions is clearer.

Reviewed By: quark-zju

Differential Revision: D14877543

fbshipit-source-id: 328906521cdbad535e28df22fea82f21e8b5410a
2019-04-14 19:56:50 -07:00
Stefan Filip
4d59694b10 types: change the underlying type for Key::path to RepoPath
Summary:
It is fairly difficult to avoid an intermediary state in which some panics are
possible. Since we don't really deal with invalid paths, this intermediary
state is not a real concern.

Reviewed By: quark-zju

Differential Revision: D14877553

fbshipit-source-id: 6f60f20af8d8f1e3ff23c5d8ab5353bc8d919ebf
2019-04-14 19:56:49 -07:00
Stefan Filip
967cd9c01b revisionstore: use testutil in indexedlogdatastore
Summary: testutil everywhere

Reviewed By: quark-zju

Differential Revision: D14884959

fbshipit-source-id: 7f999179866e4d71f0e89bd00df168e5932818f2
2019-04-14 19:56:49 -07:00
Stefan Filip
885c477d3b revisionstore: update mutabledatapack to use testutil
Summary: testutil everywhere

Reviewed By: quark-zju

Differential Revision: D14877550

fbshipit-source-id: 3aa7a345adaac3444ce73ae6c20326bbcef9e873
2019-04-14 19:56:48 -07:00
Stefan Filip
91245a8749 revisionstore: use testutil in datapack
Summary: testutil everywhere

Reviewed By: quark-zju

Differential Revision: D14877542

fbshipit-source-id: f4bd4bf97206d2a2f5deb4d28d22f9dd7bec5a72
2019-04-14 19:56:48 -07:00
Stefan Filip
78d11002eb types: remove Key::node()
Summary:
This function is difficult to justify in the context of the Rust borrow checker.
The primary concern for this pattern is preventing mutation when the object is
passed around.

We can always add the function back if it has to do more than just return the
underlying value.

Reviewed By: quark-zju

Differential Revision: D14877545

fbshipit-source-id: acdd796e1bee5445c1bce5ce0ceb41a7334e4966
2019-04-14 19:56:47 -07:00
Stefan Filip
a420476a20 revisionstore: migrate repacks.rs to use testutil
Summary: testutil everywhere

Reviewed By: quark-zju

Differential Revision: D14877549

fbshipit-source-id: 9df8e76068e68eff2895a6454dff13b21f2894ac
2019-04-14 19:56:46 -07:00
Stefan Filip
abbb2a5f7a revisionstore: use testutil in ancestors
Summary: testutil everywhere

Reviewed By: quark-zju

Differential Revision: D14877546

fbshipit-source-id: cb1534cf925a633370dd60a3191855d14bbaa84e
2019-04-14 19:56:46 -07:00
Xavier Deguillard
5ddf39f788 remotefilelog: add an indexedlog contentstore
Summary:
While the Rust code can read/write content out of an indexedlog, the Python
code cannot. For now, all the writes will be done in Rust, and the Python code
will only be able to read from it.

Reviewed By: quark-zju

Differential Revision: D14894330

fbshipit-source-id: 5c1698d31412bc93e93dabb93be106a2ef17d184
2019-04-11 12:07:58 -07:00
Xavier Deguillard
a38ac46869 revisionstore: add an indexedlog backed content store
Summary:
Packfiles are proving complex to make perform well in several situations.
For instance, repacks are required to keep common operations from spending most
of their time scanning and iterating over the filesystem. In fact, most of
the pain points with packfiles are caused by their immutability: once written,
they can no longer be updated.

IndexedLog, on the other hand, can be updated in place, and therefore does not
require repacks and thus does not exhibit some of the pathological behavior that
packfiles are showing.

As a first step, let's add a simple content store backed by indexedlog.

Reviewed By: quark-zju

Differential Revision: D14790070

fbshipit-source-id: 44f766db6a08169971f87a38246873c6e53c3233
2019-04-10 10:34:34 -07:00
Stefan Filip
f958833800 asyncpacks: update asyncdatapack.rs to testutil
Summary: testutil everywhere

Differential Revision: D14713053

fbshipit-source-id: 26fcdea580dd45280bf2f1725dcdb6ab8948465f
2019-04-08 16:21:08 -07:00
Xavier Deguillard
e106c73ddf revisionstore: do not create an empty datapack/historypack
Summary:
Zeyi realized that empty packfiles were problematic for the cdatapack code.
While its code should be fixed, having empty packfiles lying around is
unnecessary anyway, so let's not write them.

Reviewed By: fanzeyi

Differential Revision: D14760942

fbshipit-source-id: a128eedaf79a6388a3c7142399715bb4eb96a2ae
2019-04-03 20:43:06 -07:00
Xavier Deguillard
39e66964f4 revisionstore: limit history repack memory usage
Summary:
While datapack repack is fairly inexpensive in memory and mostly limited to the
number of entries in its index, a historypack repack needs to keep both the
data and the index in memory. It appears that the overhead of doing so is a big
factor in repack taking a lot of memory as a resulting 100MB histpack would use
about 1.2GB of RAM. Extrapolating the numbers, a resulting 4GB histpack would
need 48GB, which is enough to put a devserver in a swapping state, and worse
for laptops. Limiting the historypack size to 400MB should cap the RAM usage to
a bit under 5GB.

Reviewed By: kulshrax

Differential Revision: D14757839

fbshipit-source-id: b08bf01bddad01f1cae9cc67d4bd3d637c0bf0db
2019-04-03 16:56:09 -07:00
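The extrapolation in this summary is linear: 1.2GB of RAM for a 100MB histpack is a factor of 12. A small sketch of that arithmetic, assuming decimal units (1GB = 1000MB); the constant and function names are hypothetical.

```rust
/// Ratio taken from the figures above: a 100MB histpack used about 1.2GB of
/// RAM, i.e. roughly 12 bytes of RAM per byte of resulting histpack.
const RAM_FACTOR: u64 = 12;

/// Linear extrapolation of historypack repack memory use, in megabytes.
fn estimated_repack_ram_mb(histpack_mb: u64) -> u64 {
    histpack_mb * RAM_FACTOR
}
```

Under this model a 4000MB histpack needs 48000MB (48GB), and the 400MB cap gives 4800MB, which matches "a bit under 5GB".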
Arun Kulshreshtha
7036097135 revisionstore: add convenience function to add many entries to a historypack
Summary: Add a function that adds all of the entries from a given iterator into a HistoryPack.

Differential Revision: D14647767

fbshipit-source-id: 29a71b37da86125e14135c40c279bfc8a454b568
2019-03-27 12:38:39 -07:00
Stefan Filip
ca6052e70c revisionstore: update rand package
Summary: This fixes the build in test mode.

Differential Revision: D14533840

fbshipit-source-id: baa40261f17cdc8881d99a52a7f5cbd1ff66307a
2019-03-20 19:56:14 -07:00
Xavier Deguillard
1c1b1fadc7 revisionstore: make the Metadata mandatory when adding data to a datapack
Summary:
Now that mutablepacks can only create v1 packfiles, we can force the Metadata to
not be optional. The main reason for doing this is to avoid issues where LFS
data is stored without its corresponding LFS flag. This can cause issues down
the line, as LFS data will be interpreted as is, instead of being interpreted as
a pointer to the LFS blob.

Reviewed By: sfilipco

Differential Revision: D14443509

fbshipit-source-id: 9e7812017fc1356072278496406648f935024f92
2019-03-19 16:24:50 -07:00
Xavier Deguillard
10373e38e2 revisionstore: Force mutabledatapack to be created with v1
Summary:
The v0 format doesn't support flags, such as whether the data is actually an LFS pointer. Let's simply
forbid creating v0 packs.

Reviewed By: quark-zju

Differential Revision: D14443512

fbshipit-source-id: 6ffa2e8fda2b2baba0aae53e749bc9248594a134
2019-03-19 16:24:50 -07:00
Xavier Deguillard
a3cee67af5 revisionstore: ignore more errors in repack_packs
Summary:
These last 2 errors are still considered fatal, but shouldn't be and are most
likely transient. Failing to open a packfile that was successfully opened
before can for instance happen when the file is removed by another process, or
if it somehow becomes corrupted. Failing the removal of the packfile should no
longer be an issue, but if it fails, we can also ignore it with the reasoning
that the next repack will take care of it.

Reviewed By: sfilipco

Differential Revision: D14441288

fbshipit-source-id: 6c2758c2a88fd5d2d83b55defe3d263ee9f974a1
2019-03-19 16:19:14 -07:00
Arun Kulshreshtha
ef3f3dea44 types: rename LooseHistoryEntry and PackHistoryEntry
Summary: `LooseHistoryEntry` and `PackHistoryEntry` aren't the best names for these types, since the latter is what most users should use, whereas the former should typically only be used for data transmission. As such, we should rename these to clarify the intent.

Differential Revision: D14512749

fbshipit-source-id: 5293df89766825077b2ba07224297b958bf46002
2019-03-18 19:50:19 -07:00
Xavier Deguillard
41d275ad36 revisionstore: ignore transient errors during repack
Summary:
Corrupted packfiles, or background removal of them, could cause repack to fail.
Let's simply ignore these transient errors and continue repacking.

Reviewed By: DurhamG

Differential Revision: D14373901

fbshipit-source-id: afe88e89a3bd0d010459975abecb2fef7f8dff6f
2019-03-11 18:15:45 -07:00
Xavier Deguillard
f868d77cd1 revisionstore: use remove_file from vfs.rs
Summary: The historypack wasn't using remove_file from vfs which was causing repack to fail.

Reviewed By: sfilipco

Differential Revision: D14373649

fbshipit-source-id: 2d87f24bda541bc011ed38533db1ac7bdddc81e3
2019-03-07 15:24:10 -08:00
Arun Kulshreshtha
a430c04f81 revisionstore: use AsRef<Path> in constructors
Summary: `AsRef<Path>` is more ergonomic than `&Path` since the former can accept `PathBuf`, `String`, etc.

Differential Revision: D14223167

fbshipit-source-id: 12d26adaa63855c339e04734c19d6697624f9c9e
2019-02-27 12:43:43 -08:00
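A minimal sketch of a constructor that takes `AsRef<Path>`. `PackHandle` is a hypothetical type used only for illustration; the real revisionstore constructors have their own signatures and error handling.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical type standing in for a pack that remembers its location.
struct PackHandle {
    path: PathBuf,
}

impl PackHandle {
    /// Accepts anything that can be viewed as a `Path`: `&str`, `String`,
    /// `&Path`, `PathBuf`, and so on.
    fn new<P: AsRef<Path>>(path: P) -> Self {
        PackHandle { path: path.as_ref().to_path_buf() }
    }
}
```

Callers can then write `PackHandle::new("packs/abc")`, `PackHandle::new(String::from("packs/abc"))`, or pass a `PathBuf` directly, without converting to `&Path` first.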
Xavier Deguillard
ee8c0812fd revisionstore: fix test on windows
Summary: Changing the permission on the packfile failed due to the file being opened.

Reviewed By: quark-zju

Differential Revision: D14174652

fbshipit-source-id: 356ac4748fd69e660a6cb9e63367a87489755e5e
2019-02-22 10:22:30 -08:00
Xavier Deguillard
7c34139c06 revisionstore: fix compilation warnings
Summary:
Rust tells us that Rng::choose and Rng::shuffle should be replaced by
SliceRandom::choose and SliceRandom::shuffle, so let's do it.

Reviewed By: singhsrb

Differential Revision: D14178565

fbshipit-source-id: 586eb2891f1c2cab0a3435c1b4ae8f870e7a3c25
2019-02-21 18:39:21 -08:00
Arun Kulshreshtha
942d9d984a revisionstore: allow adding a PackHistoryEntry to a MutableHistoryPack
Summary: Add a convenience function to `MutableHistoryPack` to add an entry from a `PackHistoryEntry` struct.

Differential Revision: D14162781

fbshipit-source-id: a0e07f34b9231011a339ce63adcef8ab55a0555e
2019-02-21 14:39:04 -08:00
Xavier Deguillard
1d5283d1da revisionstore: refactor repack_datapacks/repack_historypacks
Summary:
With the Store trait, we can de-duplicate code between the datapack repack, and
the historypack repack.

Reviewed By: quark-zju

Differential Revision: D14091894

fbshipit-source-id: 5bf335414df2420b42ec45cce7097f3a97a49796
2019-02-20 09:40:56 -08:00
Xavier Deguillard
126de4655e revisionstore: add a Store trait
Summary:
A lot of code is duplicated between data stores and history stores, and one
reason for it is the absence of a common trait between the two. Adding a new
Store trait will make it easier to write generic code that works across
data and history stores.

Reviewed By: quark-zju

Differential Revision: D14091899

fbshipit-source-id: deef1d43a7d300cb3607c67554ad54f20c870e23
2019-02-19 12:18:27 -08:00
Xavier Deguillard
8e14fcb123 revisionstore: implement DataStore/HistoryStore for Deref types
Summary:
Instead of manually implementing DataStore/HistoryStore for Box, Rc, Arc, and
future smart pointers, we can simply implement the trait for all the types that
can be dereferenced into a DataStore/HistoryStore.

Reviewed By: quark-zju

Differential Revision: D14078072

fbshipit-source-id: 47a80ab0179b84aa08836b6e7c5c3c5f9c1a08ff
2019-02-19 12:18:27 -08:00
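A sketch of a blanket implementation over `Deref` types. The `DataStore` trait here is reduced to a single assumed method, and `OneKeyStore` and `fetch` are hypothetical helpers; the real trait has a different surface.

```rust
use std::ops::Deref;

/// Simplified stand-in for the store trait (one assumed method).
trait DataStore {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

/// Hypothetical store that only knows the key "a".
struct OneKeyStore;

impl DataStore for OneKeyStore {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        if key == "a" { Some(vec![1]) } else { None }
    }
}

/// One blanket impl covers Box, Rc, Arc, plain references, and any future
/// smart pointer whose target is a `DataStore` (including trait objects,
/// thanks to `?Sized`).
impl<T: DataStore + ?Sized, U: Deref<Target = T>> DataStore for U {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        // `self` is `&U`; deref coercion turns it into `&T`.
        T::get(self, key)
    }
}

/// Generic consumer: works with a store or any pointer to one.
fn fetch<S: DataStore>(store: S, key: &str) -> Option<Vec<u8>> {
    store.get(key)
}
```

With this in place, `fetch` accepts `Rc<OneKeyStore>`, `Arc<OneKeyStore>`, or `Box<dyn DataStore>` without any per-pointer impl.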
Arun Kulshreshtha
9c6b914a22 types: move Key and NodeInfo out of revisionstore
Summary:
In order to move the types in `edenapi-types` (containing types shared between Mercurial and Mononoke) to the `types` crate, we need to move a few types from the `revisionstore` crate into this crate first, because `revisionstore` depends on `types`, which would create a circular dependency since `edenapi-types` uses types from `revisionstore`.

In particular, this diff moves the `Key` and `NodeInfo` types into their own modules in the `types` crate.

Reviewed By: quark-zju

Differential Revision: D14114166

fbshipit-source-id: 8f9e78d610425faec9dc89ecc9e450651d24177a
2019-02-15 22:51:04 -08:00
Xavier Deguillard
76316fbf9d revisionstore: verify repacked keys before deleting pack files
Summary:
During repack, the repacked files are deleted without any verification. Since
Adam saw some data loss, it's possible that somehow repack didn't fully repack
a packfile but it was deleted. Let's verify that the entire packfile was
repacked before deleting it.

Since repack is mostly a background operation, we don't have a way to notify
the user, but we can log the error to a scuba table to analyse further.

Reviewed By: DurhamG

Differential Revision: D14069766

fbshipit-source-id: 4358a87deeb9732eec1afdfb742e8d81db41cd87
2019-02-14 13:03:09 -08:00
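The verification step might look like the following sketch, assuming keys can be represented as strings. `missing_keys` and `can_delete_old_pack` are hypothetical helpers, not the actual repack code.

```rust
use std::collections::HashSet;

/// Returns the keys of the old pack that did not make it into the new pack.
fn missing_keys<'a>(old_pack_keys: &'a [String], repacked: &HashSet<String>) -> Vec<&'a String> {
    old_pack_keys
        .iter()
        .filter(|key| !repacked.contains(*key))
        .collect()
}

/// The old packfile may only be deleted when every one of its keys is
/// present in the repacked output.
fn can_delete_old_pack(old_pack_keys: &[String], repacked: &HashSet<String>) -> bool {
    missing_keys(old_pack_keys, repacked).is_empty()
}
```

When `missing_keys` is non-empty, the caller would keep the old packfile and log the missing keys instead of deleting it.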
Xavier Deguillard
e5a7da32da revisionstore: rename the packfile before removal on windows
Summary:
Removing files on Windows is hard. It can fail for many reasons, many of which
involve another process having the file opened in some way. One way around
this problem is to rename the file, which isn't as restrictive as removing it.

Since hg repack will attempt to remove any temporary files, it will also try to
remove the packfiles that we failed to remove earlier.

Reviewed By: DurhamG

Differential Revision: D14030445

fbshipit-source-id: 1f3799e021c2e0451943a1d5bd4cd25ed608ffb6
2019-02-14 10:34:52 -08:00
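A rough sketch of the rename-then-remove idea using only the standard library. The function name and the `.to-delete` suffix are assumptions made for illustration; the real helper in vfs.rs may differ.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Renames `path` to a temporary name, then tries to remove the renamed file.
/// A failed removal is ignored, leaving the temporary file for a later
/// cleanup pass.
fn remove_file_via_rename(path: &Path) -> io::Result<()> {
    // Hypothetical suffix marking the file as pending deletion.
    let mut tmp = path.as_os_str().to_owned();
    tmp.push(".to-delete");

    // Renaming can succeed in cases where a direct removal would fail.
    fs::rename(path, &tmp)?;

    // Best effort: if another process still holds the file, keep going.
    let _ = fs::remove_file(&tmp);
    Ok(())
}
```

After the call the original name is gone even if the removal itself failed, so a new packfile with the same name can be written.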
Xavier Deguillard
8c40ed3a71 revisionstore: ignore AlreadyExists errors when persisting a mutable pack
Summary:
Packfiles are named based on their content, so having an on-disk file with the
same name means that they have the same content. If that happens, let's simply
continue without failing.

Reviewed By: DurhamG

Differential Revision: D14030446

fbshipit-source-id: f04c15507c89b2fca19c95a7b41d8e65c88da019
2019-02-14 10:34:52 -08:00
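Treating `AlreadyExists` as success can be sketched with `std::io::ErrorKind`. The helper name `ignore_already_exists` is hypothetical; it would wrap whatever call persists the pack.

```rust
use std::io;

/// Maps an `AlreadyExists` failure to success and passes everything else
/// through. This is safe only because packfiles are named after their
/// content, so an existing file with the same name holds the same data.
fn ignore_already_exists(result: io::Result<()>) -> io::Result<()> {
    match result {
        Err(ref e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(()),
        other => other,
    }
}
```

Other error kinds, such as `NotFound` or `PermissionDenied`, are still propagated to the caller.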