Summary:
The Rust code is almost at parity with the Python code, let's expose it to
Python.
Reviewed By: quark-zju
Differential Revision: D16794076
fbshipit-source-id: faf1da775b4e57328be62a06d0065c7becf1b9f4
Summary:
In some rare situations, the packs directory may not exist. The PackStore
shouldn't fail in this case; it should just act as if nothing is present in
it.
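A minimal sketch of the intended behavior, using only the standard library.
The `list_packs` helper is hypothetical and not the actual PackStore code; it
only illustrates treating a missing directory as an empty one while still
reporting other I/O errors.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// List the entries of `dir`, treating a missing directory as empty.
/// Hypothetical helper, not the real PackStore implementation.
fn list_packs(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let entries = match fs::read_dir(dir) {
        Ok(entries) => entries,
        // A directory that doesn't exist simply contains no packfiles.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(Vec::new()),
        // Any other error (permissions, I/O failure) is still reported.
        Err(e) => return Err(e),
    };
    let mut packs = Vec::new();
    for entry in entries {
        packs.push(entry?.path());
    }
    Ok(packs)
}
```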
Reviewed By: quark-zju
Differential Revision: D16794079
fbshipit-source-id: e33bd55e0f378b9be58831a85ec822221af7a9bc
Summary:
The LruStore keeps the packfiles in a somewhat ordered manner to reduce the
cost of finding an object in the stores. For now, its implementation is very
basic as it just moves the store where data was found to the front.
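The move-to-front idea can be sketched as follows. `Pack` and this `LruStore`
are simplified stand-ins invented for illustration (keys are plain integers),
not the types from the actual change.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a packfile: a name and the keys it contains.
struct Pack {
    name: &'static str,
    keys: Vec<u32>,
}

/// Simplified store that keeps recently useful packs at the front.
struct LruStore {
    packs: VecDeque<Pack>,
}

impl LruStore {
    /// Scan the packs in order; on a hit, move that pack to the front so
    /// the next lookup tries it first.
    fn find(&mut self, key: u32) -> Option<&'static str> {
        let idx = self.packs.iter().position(|p| p.keys.contains(&key))?;
        let pack = self.packs.remove(idx)?;
        let name = pack.name;
        self.packs.push_front(pack);
        Some(name)
    }
}
```

After a hit in the last pack, that pack becomes the first one scanned, and the
relative order of the others is unchanged.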
Reviewed By: quark-zju
Differential Revision: D16746800
fbshipit-source-id: 67375e6ab8a4d9e54da9a9bb4af5d95061446e6f
Summary: This just iterates over all the keys contained in the store.
Reviewed By: quark-zju
Differential Revision: D17104467
fbshipit-source-id: 34eee949cf4a1504df67c09fab7f71ef95210eed
Summary:
We've seen a handful of cases where the indexedlogdatastore becomes corrupted
which makes Mercurial unable to run properly. For now, and since we only use
the indexedlog for the hgcache, let's just remove it.
A better solution would be to harden the indexedlog code to better detect
corruption and attempt to repair it.
Reviewed By: quark-zju
Differential Revision: D17115622
fbshipit-source-id: ee474a5df60c4414f6ea21ace7dff0f7048879c9
Summary:
The indexedlog based datastore is being rolled out more broadly, let's add a
basic historystore indexedlog to replace the historypacks. One of its first uses
will be in the hg_memcache_client write path to remove some pathological cases
where hg_memcache_client can write thousands of packfiles. This in turn will
remove the need to run repack to keep the number of packfiles under control.
The IndexedLog key is the concatenation of sha1(path) and the node. Hashing the
path should be fairly cheap and makes it easy to integrate with the IndexedLog.
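A sketch of the key layout described above. Computing sha1(path) is left to
the caller here, since it needs a hashing crate; `make_key` and `HASH_LEN` are
names made up for this example.

```rust
/// Length in bytes of a SHA-1 digest, which is also the length of a node.
const HASH_LEN: usize = 20;

/// Build the 40-byte lookup key: sha1(path) followed by the node.
/// `path_sha1` is assumed to have been computed by the caller.
fn make_key(path_sha1: &[u8; HASH_LEN], node: &[u8; HASH_LEN]) -> [u8; 2 * HASH_LEN] {
    let mut key = [0u8; 2 * HASH_LEN];
    key[..HASH_LEN].copy_from_slice(path_sha1);
    key[HASH_LEN..].copy_from_slice(node);
    key
}
```

Because both halves have a fixed length, the key has a fixed length as well.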
One of the drawbacks versus the histpack will be storage space usage, as the
path is always stored per entry, while it was shared between multiple entries in
the history pack. I thought about having a separate index to easily translate
the hashed path to the path, but due to the potential log rotation, we could
end up in a case where the path isn't present at all in the store.
Reviewed By: quark-zju
Differential Revision: D16616082
fbshipit-source-id: 1e47260b479f8923cc137a39dcba54b2d074f43a
Summary: New debugstore command prints the contents of a blob in the store given a filename and hash.
Reviewed By: xavierd
Differential Revision: D16791780
fbshipit-source-id: d4529f3f368677b4f65a5772f82a1655552fefa5
Summary:
This updates Mononoke to support LFS metadata when serving data over getpackv2.
However, in doing so, I've also refactored the various ways in which we currently access file data to serve it to clients or to process client uploads (when we need to compute deltas). The motivation to do that is that we've had several issues recently where some protocols knew about some functionality, and others didn't. Notably, redaction and LFS were supported in getfiles, but neither of them were supported in getpack or eden_get_data.
This patch refactors all those callsites away from blobrepo and instead through repo_client/remotefilelog, which provides an internal common method to fetch a filenode and return its metadata and bytes (prepare_blob), and separate protocol specific implementations for getpackv1 (includes metadata + file content -- this is basically the existing fetch_raw_filenode_bytes function), getpackv2 (includes metadata + file contents + getpackv2 metadata), getfiles (includes just file content, and ties file history into its response) and eden_get_data (which uses getpackv1).
A few changes are worth noting as you review this:
- The getfiles method used to get its filenode from get_maybe_draft_filenode, but all it needed was the copy info. However, the updated method gets its filenode from the envelope (which also has this data). This should be equivalent.
- I haven't been able to remove fetch_raw_filenode_bytes yet because there's a callsite that still uses it and it's not entirely clear to me whether this is used and why. I'll look into it, but for now I left it unchanged.
- I've used the Mercurial implementation of getpack metadata here. This feels like the better approach so we can reuse some of the code, but historically I don't think we've depended on many Mercurial crates. Let me know if there's a reason not to do that.
Finally, there are a couple things to be aware of as you review:
- I removed some more `Arc<BlobRepo>` in places where it made it more difficult to call the new remotefilelog methods.
- I updated the implementation to get copy metadata out of a file envelope to not require copying the metadata into a mercurial::file::File only to immediately discard it.
- I cleaned up an LFS integration test a little bit. There are few functional changes there, but it makes tests a little easier to work with.
Reviewed By: farnz
Differential Revision: D16784413
fbshipit-source-id: 5c045d001472fb338a009044ede1e22ccd34dc55
Summary: We never used this, and loosefiles are pretty much gone, let's remove the code.
Reviewed By: quark-zju
Differential Revision: D16726160
fbshipit-source-id: f810356a0a9cd980d8e5306bc967c3f25475afa6
Summary:
The PackStore is one of the last missing pieces to be able to have all the code
that deals with data on disk in Rust. There are a couple of features that the
Python code has that aren't present here:
1) LRU of packfiles
2) Detection of corrupted packfiles
3) Automatic repacking
While 1) and 2) will be added later, I'm not planning on working on 3). The
only reason we still have 3) is due to hg_memcache_client potentially writing
histpacks one by one. I'm planning on adding a very basic IndexedLog based
history store and have hg_memcache_client write to it. This will greatly reduce
the number of histpacks created and thus remove the need for the repacks.
Reviewed By: sfilipco
Differential Revision: D16606429
fbshipit-source-id: 06bcdad6b79de4cd6284c685cccf6fdf7ab3f8a7
Summary:
The Rust code didn't have a nice description of the file format, let's copy it
from the Python code.
Reviewed By: kulshrax
Differential Revision: D16423135
fbshipit-source-id: beeae2857fa5e999a3b41e72726b29b865a6a2b4
Summary:
The code would wrongly fail if only part of a chain is present in the store.
Let's properly succeed in this case.
Reviewed By: kulshrax
Differential Revision: D16392288
fbshipit-source-id: addaf340f947e48a2747bbe240d6aae081ae165c
Summary:
We've had issues in the perforce and svn converter where a node's parent was
itself, causing mayhem. Durham fixed this (D16189602) in the python code, but
since the python code is going away very soon, let's have this check in the
Rust code too.
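A minimal sketch of such a check. The `Node` type and `check_parents` function
are hypothetical names for this example; the real check lives wherever history
entries are added.

```rust
/// Hypothetical 20-byte node identifier.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Node([u8; 20]);

/// Reject an entry whose first or second parent is the node itself.
fn check_parents(node: Node, p1: Node, p2: Node) -> Result<(), String> {
    if node == p1 || node == p2 {
        return Err(format!("node {:?} cannot be its own parent", node));
    }
    Ok(())
}
```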
Reviewed By: quark-zju
Differential Revision: D16346625
fbshipit-source-id: 54cec11a7822510ede91824eaf8663a9c1b7a0aa
Summary:
This diff sets two Rust lints to warn in fbcode:
```
[rust]
warn_lints = bare_trait_objects, ellipsis_inclusive_range_patterns
```
and fixes occurrences of those warnings within common/rust, hg, and mononoke.
Both of these lints are set to warn by default starting with rustc 1.37. Enabling them early avoids writing even more new code that needs to be fixed when we pull in 1.37 in six weeks.
Upstream tracking issue: https://github.com/rust-lang/rust/issues/54910
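For reference, a small sketch of the syntax each lint asks for. The `Store`
trait, `Pack` type, and both functions are made up for the example.
```rust
trait Store {
    fn name(&self) -> &'static str;
}

struct Pack;

impl Store for Pack {
    fn name(&self) -> &'static str {
        "pack"
    }
}

// `bare_trait_objects` flags `Box<Store>`; spelling it `Box<dyn Store>` is the fix.
fn open() -> Box<dyn Store> {
    Box::new(Pack)
}

// `ellipsis_inclusive_range_patterns` flags `0...9`; `0..=9` is the fix.
fn classify(n: u8) -> &'static str {
    match n {
        0..=9 => "digit",
        _ => "other",
    }
}
```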
Reviewed By: Imxset21
Differential Revision: D16200291
fbshipit-source-id: aca11a7a944e9fa95f94e226b52f6f053b97ec74
Summary: It will be useful when implementing debugging features for the store.
Reviewed By: kulshrax
Differential Revision: D16063177
fbshipit-source-id: acfd8cbe697e947ac4395a38377172eeaf1aa230
Summary:
This is needed to switch `hg debugdatapack` to the Rust datapack. One major
difference with the Python code is that the Rust code builds a Vec instead of
an iterator. Implementing iterators with cpython-rust is hard (is it actually
possible?), but since the stored content isn't returned the Vec shouldn't use
too much memory.
For now, the implementation is tied to a packfile, but it shouldn't be too hard
to implement for the IndexedLogDataStore too.
Reviewed By: kulshrax
Differential Revision: D16039097
fbshipit-source-id: 567809785263bbf8cd3e1a9c24ecbb989b5c2496
Summary:
We've recently had issues with our Python implementation of MutableDataPack,
and a careful audit of this code by DurhamG revealed that the Rust code had a
small issue too.
If the read_entry function fails in between the 2 seeks, the file position
won't be at the end of the file when add is called, potentially corrupting the
content of the file.
The previous solution just involved panicking if the SeekFrom::End failed,
as there is no really good way of recovering from that failure, unless we want
to add lots of complexity to the code.
This diff changes it to use O_APPEND at open time so the kernel would always
seek to the end of the file on write(2).
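A standalone sketch of the effect, using `OpenOptions::append` from the
standard library (which sets O_APPEND on Unix). The `append_after_seek` helper
is invented for this example: it deliberately moves the cursor to the start of
the file before writing, to show that the write still lands at the end.

```rust
use std::fs::OpenOptions;
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Append `data` to the file at `path` even though the cursor was moved
/// elsewhere first, then return the whole file content.
fn append_after_seek(path: &Path, data: &[u8]) -> io::Result<Vec<u8>> {
    let mut file = OpenOptions::new()
        .read(true)
        .append(true) // every write goes to the end of the file
        .create(true)
        .open(path)?;
    // Simulate a read that left the cursor away from the end of the file.
    file.seek(SeekFrom::Start(0))?;
    file.write_all(data)?;
    // Read everything back to observe where the write landed.
    file.seek(SeekFrom::Start(0))?;
    let mut contents = Vec::new();
    file.read_to_end(&mut contents)?;
    Ok(contents)
}
```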
Reviewed By: quark-zju
Differential Revision: D16037868
fbshipit-source-id: 60ffcd3eb256b3854fb2e816cf1f289a1ef92ef6
Summary:
Similarly to MutableDataStore, every MutableHistoryStore can provide read
access to its content, so let's enforce this at compile time.
The PythonMutableHistoryPack has empty implementations as the type is only
going to be used for writing and is present only as a temporary solution until
Rust mutable history stores are used exclusively.
Reviewed By: kulshrax
Differential Revision: D15717840
fbshipit-source-id: b2b3242aba548f72fbe218738dc77c49b24813f5
Summary:
Trait objects aren't sized, and therefore these blanket implementations weren't
available for trait objects. Since there is no requirement for a Sized type,
let's remove it.
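A reduced sketch of the pattern. The `DataStore` trait here has a simplified,
made-up signature, and `MemStore` and `lookup` exist only for the example.
Without `?Sized`, `T` could not be `dyn DataStore`, so `Box<dyn DataStore>`
would not get the blanket implementation.

```rust
use std::ops::Deref;

trait DataStore {
    fn get(&self, key: u32) -> Option<String>;
}

// `?Sized` lets `T` be `dyn DataStore`, so `Box<dyn DataStore>` is covered.
impl<T: DataStore + ?Sized, U: Deref<Target = T>> DataStore for U {
    fn get(&self, key: u32) -> Option<String> {
        // `&**self` goes from `&U` to `&T` through `Deref`.
        T::get(&**self, key)
    }
}

/// Hypothetical concrete store holding a single entry.
struct MemStore;

impl DataStore for MemStore {
    fn get(&self, key: u32) -> Option<String> {
        if key == 1 {
            Some("one".to_string())
        } else {
            None
        }
    }
}

/// Generic code that requires `S: DataStore` by value.
fn lookup<S: DataStore>(store: S, key: u32) -> Option<String> {
    store.get(key)
}
```

With this in place, `lookup` accepts a `Box<dyn DataStore>` as well as other
pointer types such as `Rc<MemStore>`.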
Reviewed By: kulshrax
Differential Revision: D15717836
fbshipit-source-id: 9248f9bee7b3917d21be5105aa1848b6dd177007
Summary: The trait object is already boxed, no need to box it a second time.
Reviewed By: quark-zju
Differential Revision: D15546379
fbshipit-source-id: a491ca6dd70f21d6e967ae9e08802c6ebe9960fc
Summary: This is required to be used outside of revisionstore.
Reviewed By: quark-zju
Differential Revision: D15546382
fbshipit-source-id: 84c321dd0d200e8fa7e97ebbc195be0a941cbe68
Summary:
The multiplex stores are otherwise not Send, which is required for their use in
the python bindings.
Reviewed By: quark-zju
Differential Revision: D15546383
fbshipit-source-id: da6f6fdf3fb8b8d9682727d69e08d30afa729299
Summary:
When failing to remove a file, make sure that we add the '.' in between
the file stem and its extension. This is required in order to have future
repacks remove these temporary files.
Differential Revision: D15558277
fbshipit-source-id: f43fca8a4aa4663345fadd57c83f46cea7e74981
Summary:
The KeyError is a special case as it's recognized by Python and allows the
union stores to keep searching. With any other error, hg crashes, so let's make
sure the mutable stores properly return a KeyError when a key isn't found.
Reviewed By: kulshrax
Differential Revision: D15504681
fbshipit-source-id: 2a989cb0b5c82d9fd481c5ff7c7815122602aab0
Summary:
This is an optimization that is present in the Python code, and switching to
the Rust code yields different packfile hashes due to it. Let's add the
optimization for now.
Reviewed By: quark-zju
Differential Revision: D15481171
fbshipit-source-id: 144a8611e13935a278936aef4b5e5bb958d2532d
Summary:
Trait objects aren't sized, but we do want to make sure that when wrapped into
a deref type (such as Box), we do have the blanket implementation.
Reviewed By: kulshrax
Differential Revision: D15474318
fbshipit-source-id: 3777a9de487bce226b7f5e40767bd2d8e4f63daa
Summary:
In all cases, the mutable stores also support reading from them. Let's have the
compiler enforce this, and implement the missing traits.
Reviewed By: kulshrax
Differential Revision: D15474315
fbshipit-source-id: dfcd93f319582ff3cd54aeabaa66bc6df9ce63a8
Summary:
Now that flush is implemented everywhere, we can replace the use of close with
flush.
Reviewed By: kulshrax
Differential Revision: D15416717
fbshipit-source-id: 5aea730b435e3c2073619ba676e60134f59f87c9
Summary:
All the stores are implementing it, we can now remove the default
unimplemented.
Reviewed By: kulshrax
Differential Revision: D15416710
fbshipit-source-id: 84f6bb7c0cf7db4161ca61a55d15691a00a5b18f
Summary:
Now that MutableDeltaStore::flush is implemented everywhere, let's remove the
close method and replace it with flush where necessary.
Reviewed By: kulshrax
Differential Revision: D15416716
fbshipit-source-id: e66dad66a3aff25e80efb10dc2e22c9878336699
Summary: The method is now implemented in all stores, we can get rid of it.
Reviewed By: kulshrax
Differential Revision: D15416715
fbshipit-source-id: 6d6e6f5bfaf66796f66efbdc943e125b56808e39
Summary:
As we're migrating toward using flush instead of close, the MultiplexStore no
longer requires a "&mut T", we can directly get the ownership of the object and
flush can be properly implemented.
Reviewed By: kulshrax
Differential Revision: D15416704
fbshipit-source-id: 3cf6c9bf829297e07853484913e6260331ab2b3b
Summary:
Similarly to MutableDataPack, this splits MutableHistoryPack and implements the
flush method on the outer struct. The flush merely closes the inner one and then
re-opens a new one.
Reviewed By: kulshrax
Differential Revision: D15416707
fbshipit-source-id: a81f9103de53a6d9fdabc88b6a1e00d04e39f72e
Summary:
Similarly to the flush method for MutableDeltaStore, this will allow a
MutableHistoryStore to be object safe.
Reviewed By: kulshrax
Differential Revision: D15416713
fbshipit-source-id: 888fb6154fa4bde125c5a9b1a7be6b11279a4f6f
Summary:
The flush implementation just closes the current packfile, and re-opens a new
one. This satisfies the property that writes are possible after a flush.
Reviewed By: kulshrax
Differential Revision: D15416705
fbshipit-source-id: 2024e8ae714c18ead5e020fb681736013ccca4b2
Summary:
Since flush requires the store to be kept open after it returns, we need to be
able to re-create a MutableDataPack. For this, let's use the inner pattern to
keep the current pending datapack.
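A reduced sketch of the inner pattern, with invented types: `Inner` stands in
for the pending pack, and "closing" it just returns the number of entries it
held. Only the swap-then-close shape is the point here.

```rust
use std::mem;

/// Hypothetical stand-in for the pending packfile.
#[derive(Default)]
struct Inner {
    entries: Vec<u32>,
}

impl Inner {
    /// Consume the pending pack; returns how many entries it contained.
    fn close(self) -> usize {
        self.entries.len()
    }
}

/// Outer struct that stays usable across flushes.
struct MutablePack {
    inner: Inner,
    closed_packs: Vec<usize>,
}

impl MutablePack {
    fn new() -> Self {
        MutablePack {
            inner: Inner::default(),
            closed_packs: Vec::new(),
        }
    }

    fn add(&mut self, value: u32) {
        self.inner.entries.push(value);
    }

    /// Swap in a fresh inner pack, then close the old one by value.
    fn flush(&mut self) {
        let old = mem::replace(&mut self.inner, Inner::default());
        self.closed_packs.push(old.close());
    }
}
```

The outer struct only needs `&mut self` for `flush`, while the inner one can
still be consumed by `close`.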
Reviewed By: kulshrax
Differential Revision: D15416714
fbshipit-source-id: 4470e50e982681c8e8fadd920d58027132fb3420
Summary: The name was suggested by DurhamG. It's a nice idea. Therefore the rename.
Reviewed By: singhsrb, xavierd
Differential Revision: D15420361
fbshipit-source-id: 1f591c2ecd8082607c3b6fd34c83068e5f555c99
Summary: The flush method is now in the MutableDeltaStore, so move it there.
Reviewed By: kulshrax
Differential Revision: D15416711
fbshipit-source-id: 55eb411e4e4cf98c51813ef29364dcba74dc7f66
Summary:
A MutableDeltaStore isn't object safe due to the close method. This means that
we can't use them in generic context and the code has to work around this
limitation.
Let's replace the close method by a flush one, that no longer takes the
ownership of self. The behavior of flush is to simply make the previous writes
visible, while still allowing future writes to the same object.
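A small sketch of the resulting shape, with a made-up `MutableStore` trait and
in-memory `MemStore`. Because `flush` borrows `self` instead of consuming it,
it can be called through a `&mut dyn` reference and the store can be written
to again afterwards.

```rust
trait MutableStore {
    fn add(&mut self, value: u32);
    /// Make earlier writes visible and return how many were flushed.
    /// The store stays usable afterwards.
    fn flush(&mut self) -> usize;
}

/// Hypothetical in-memory store used only for this example.
struct MemStore {
    pending: Vec<u32>,
    visible: Vec<u32>,
}

impl MutableStore for MemStore {
    fn add(&mut self, value: u32) {
        self.pending.push(value);
    }

    fn flush(&mut self) -> usize {
        let flushed = self.pending.len();
        // Move all pending writes to the visible set, leaving `pending` empty.
        self.visible.append(&mut self.pending);
        flushed
    }
}

/// Generic code working on any store through a trait object.
fn write_all(store: &mut dyn MutableStore, values: &[u32]) -> usize {
    for value in values {
        store.add(*value);
    }
    store.flush()
}
```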
Reviewed By: kulshrax
Differential Revision: D15416709
fbshipit-source-id: 11c96d672171eaac91add3d7bcad7f8eae70de16
Summary: This new store duplicates all the writes to all the stores that it's made of.
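A minimal sketch of the idea, with invented names (`MutableStore`, `VecStore`)
and a simplified `add` signature; the real store works on deltas and can fail.

```rust
trait MutableStore {
    fn add(&mut self, value: u32);
}

/// Hypothetical in-memory store used only for this example.
#[derive(Default)]
struct VecStore {
    values: Vec<u32>,
}

impl MutableStore for VecStore {
    fn add(&mut self, value: u32) {
        self.values.push(value);
    }
}

/// Forwards every write to each of the stores it owns.
struct MultiplexStore<T: MutableStore> {
    stores: Vec<T>,
}

impl<T: MutableStore> MutableStore for MultiplexStore<T> {
    fn add(&mut self, value: u32) {
        for store in self.stores.iter_mut() {
            store.add(value);
        }
    }
}
```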
Reviewed By: kulshrax
Differential Revision: D15337751
fbshipit-source-id: f9458f7e76b8893b302cdd5dec1409c439a236c9
Summary:
For now, only pack-file based stores implement MutableHistoryStore, but once
the trait is implemented for stores that are updated in place, returning a Path
on close will not make much sense.
Reviewed By: kulshrax
Differential Revision: D15285970
fbshipit-source-id: 011db2b60c11c1eebfe11881cfc5ebafa1676704
Summary:
While it makes sense that closing a datapack returns the path, it doesn't
really mean anything for an update in place store. Let's change the API to
return an Option<PathBuf>.
Reviewed By: kulshrax
Differential Revision: D15285969
fbshipit-source-id: 804acd75607e86a0bc875910f6aaa300a5526558
Summary:
With the MutableDeltaStore and MutableHistoryStore traits, this one is no
longer used outside of revisionstore. Let's stop exporting it.
Reviewed By: kulshrax
Differential Revision: D15284473
fbshipit-source-id: 61d87a1843cfed51dddce36e314128ac8244a2dc
Summary: The edenapi is now independent of the storage type for history data.
Reviewed By: kulshrax
Differential Revision: D15284355
fbshipit-source-id: 72a5db42bb0fb19ee03155b13914202581ab5966
Summary:
This allows for a MutableHistoryPack to be used where a MutableHistoryStore
will be required. Once an IndexedLog based history store is implemented we will
be able to switch between the 2 more easily.
Reviewed By: kulshrax
Differential Revision: D15284356
fbshipit-source-id: 91d75ddc6991c26eace67d77679bb8d5806cf8b8
Summary: This will help in abstracting the kind of store that is being written to.
Reviewed By: kulshrax
Differential Revision: D15284358
fbshipit-source-id: ab6a6d23978480ca65587b745ae39ac6ed98cca9
Summary:
This will allow the edenapi code to transparently use the IndexedLogDataStore
or a datapack.
Reviewed By: kulshrax
Differential Revision: D15266194
fbshipit-source-id: 6396118a5c8107a8c91e5fc83fe4297d4321d10c
Summary:
This will be used to abstract writing to a MutableDataPack or
IndexedLogDataStore (or both).
Reviewed By: kulshrax
Differential Revision: D15266193
fbshipit-source-id: 99f2383555addbafea81a2752e8d6759a1c1c5e7