Summary: The edenapi is now independent of the storage type used for history data.
Reviewed By: kulshrax
Differential Revision: D15284355
fbshipit-source-id: 72a5db42bb0fb19ee03155b13914202581ab5966
Summary:
This allows a MutableHistoryPack to be used wherever a MutableHistoryStore
will be required. Once an IndexedLog-based history store is implemented, we will
be able to switch between the two more easily.
Reviewed By: kulshrax
Differential Revision: D15284356
fbshipit-source-id: 91d75ddc6991c26eace67d77679bb8d5806cf8b8
Summary: This will help in abstracting the kind of store that is being written to.
Reviewed By: kulshrax
Differential Revision: D15284358
fbshipit-source-id: ab6a6d23978480ca65587b745ae39ac6ed98cca9
Summary:
The type of store where data is stored is now fully abstracted to the Python
bindings. For now, edenapi will write to the pending mutabledatapack, but we
can now switch it easily to any other store implementing MutableDeltaStore,
including an IndexedLogDataStore.
Reviewed By: kulshrax
Differential Revision: D15266191
fbshipit-source-id: 638cf90a567ef170e0302376312c4b82e6d6b6da
Summary:
This will allow the edenapi code to transparently use either the
IndexedLogDataStore or a datapack.
Reviewed By: kulshrax
Differential Revision: D15266194
fbshipit-source-id: 6396118a5c8107a8c91e5fc83fe4297d4321d10c
Summary:
This will be used to abstract writing to a MutableDataPack or
IndexedLogDataStore (or both).
Reviewed By: kulshrax
Differential Revision: D15266193
fbshipit-source-id: 99f2383555addbafea81a2752e8d6759a1c1c5e7
Summary: As we add more functionality to the Eden API, we will have a lot more request structs. These structs are only used by the HTTP data fetching code, and should not be used by actual business logic. As such, while these types need to be public (so that both Mononoke and Mercurial can use them), they should not be re-exported at the top level.
Reviewed By: quark-zju
Differential Revision: D15268439
fbshipit-source-id: e7d1405d2ac234892baedbf7dbf3e133d187cb45
Summary:
A tree manifest entry must always end with a line feed. It is somewhat
redundant but that's how the serialization is defined. Sometimes that last
line feed is missing in our data. I don't know why.
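The commit does not say how the missing line feed is handled. One possible approach, sketched here with a hypothetical helper name, is to normalize the serialized bytes before they are parsed or stored:

```rust
/// Append a trailing line feed if the serialized manifest entry lacks one.
///
/// Hypothetical helper: the name and signature are illustrative only and
/// are not taken from the actual change.
pub fn ensure_trailing_newline(mut data: Vec<u8>) -> Vec<u8> {
    // An empty buffer is left untouched: there is no entry to terminate.
    if !data.is_empty() && data.last() != Some(&b'\n') {
        data.push(b'\n');
    }
    data
}
```

Data that already ends with a line feed is returned unchanged, so the helper is safe to apply unconditionally.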
Reviewed By: quark-zju
Differential Revision: D15110860
fbshipit-source-id: c4ac5075e22a8b8851f6b246d22af8ab68f42a74
Summary:
This is a quality of life improvement for working with the storage layer.
We probably don't gain a whole lot by statically linking the store and it is
useful to have some flexibility in the storage layer.
Differential Revision: D15110859
fbshipit-source-id: 6102acafa21dd1dbaeed0f8fc3147538a8c301d1
Summary:
When remotefilelog.fetchpacks is enabled, it's possible that 100 packfiles of
100MB each are present. In this case, every new packfile that
hg_memcache_client writes will force an incremental repack, which will
only reduce the number of packfiles by a small amount.
Let's have a simple heuristic that tries to bring the number of packfiles
below 50.
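The exact heuristic is not spelled out here. As a rough sketch only, assuming the repack merges the smallest packfiles first (the function name, the size-based selection, and the `max_packs` parameter are all assumptions), the selection could look like this:

```rust
/// Pick which packfiles to merge so that fewer than `max_packs` remain.
///
/// Hypothetical helper: `sizes[i]` is the size in bytes of packfile `i`.
/// Merging `k` packs into one leaves `n - k + 1` packs, so choosing
/// `k = n - max_packs + 2` leaves `max_packs - 1` packs.
pub fn select_packs_to_merge(sizes: &[u64], max_packs: usize) -> Vec<usize> {
    let n = sizes.len();
    if n < max_packs {
        return Vec::new(); // already below the limit
    }
    let k = (n + 2).saturating_sub(max_packs).min(n);
    if k < 2 {
        return Vec::new(); // merging fewer than two packs is pointless
    }
    // Assumption: prefer the smallest packs so the repack stays cheap.
    let mut order: Vec<usize> = (0..n).collect();
    order.sort_by_key(|&i| sizes[i]);
    order.truncate(k);
    order
}
```

With 100 packfiles and a limit of 50, this selects 52 packs, leaving 49 after the merge.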
Reviewed By: DurhamG
Differential Revision: D15203771
fbshipit-source-id: 18c39487d5ac087d4879004993c1c1add087249c
Summary:
Add simple algorithms to select all ancestors of a single node, or calculate a
"random" gca of two nodes.
Mercurial supports more "advanced" operations, like calculating ancestors of
multiple nodes, or calculating all ancestors of more than 2 nodes. We'll
see if those are necessary and maybe build fast or slow paths for them.
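As an illustration of the two simple operations, here is a minimal sketch over plain integer ids. It does not use segments at all and assumes that parents always have smaller ids than their children; `ancestors`, `gca_one`, and `example_parents` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Example graph: 0 <- 1 <- 2, 1 <- 3, and 4 merges 2 and 3.
pub fn example_parents(id: u64) -> Vec<u64> {
    match id {
        1 => vec![0],
        2 | 3 => vec![1],
        4 => vec![2, 3],
        _ => vec![],
    }
}

/// All ancestors of `node`, including `node` itself.
pub fn ancestors<F>(node: u64, get_parents: F) -> BTreeSet<u64>
where
    F: Fn(u64) -> Vec<u64>,
{
    let mut seen = BTreeSet::new();
    let mut stack = vec![node];
    while let Some(id) = stack.pop() {
        // `insert` returns false for ids that were already visited.
        if seen.insert(id) {
            stack.extend(get_parents(id));
        }
    }
    seen
}

/// One greatest common ancestor of `a` and `b`, if any.
///
/// Assumption: parents have smaller ids than children, so the largest
/// common ancestor id cannot have a descendant that is also common.
pub fn gca_one<F>(a: u64, b: u64, get_parents: F) -> Option<u64>
where
    F: Fn(u64) -> Vec<u64>,
{
    let left = ancestors(a, &get_parents);
    let right = ancestors(b, &get_parents);
    left.intersection(&right).max().copied()
}
```

When several greatest common ancestors exist, this returns only the one with the largest id, which matches the "random" gca described above in spirit.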
Reviewed By: sfilipco
Differential Revision: D15055347
fbshipit-source-id: c8c2bac2797d0389adb58c89b67e3ddfb62eb06f
Summary:
We now have enough building blocks to put things together.
The tests are taken from the slides: both the examples and some corner cases,
plus two maybe-interesting synthetic cases.
There are probably more details to test. But this should give us some level of
confidence.
Reviewed By: sfilipco
Differential Revision: D15055346
fbshipit-source-id: a76b70fec0ec7e88378830f251f997d147416db0
Summary:
High-level segments are built on top of lower level segments.
We simply scan through them, and greedily pick the longest ones.
Reviewed By: sfilipco
Differential Revision: D15055348
fbshipit-source-id: 3b72dc766abd46669b787187b7d1d5f7171c026a
Summary:
To build flat segments, we take a `get_parents(id) -> [id]` function and an end
`id`. Then we scan through the missing ids and greedily try to turn them into
segments.
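A minimal sketch of such a greedy scan follows. It assumes that a flat segment is a contiguous id range in which every id except the first has exactly one parent, namely the previous id; the `FlatSegment` layout and the function names are hypothetical.

```rust
#[derive(Debug, PartialEq)]
pub struct FlatSegment {
    pub low: u64,
    pub high: u64,
    /// Parents of `low`; ids inside the segment have the previous id as parent.
    pub parents: Vec<u64>,
}

/// Example graph: 0 <- 1 <- 2, 1 <- 3, and 4 merges 2 and 3.
pub fn example_parents(id: u64) -> Vec<u64> {
    match id {
        1 => vec![0],
        2 | 3 => vec![1],
        4 => vec![2, 3],
        _ => vec![],
    }
}

/// Greedily cover `first_missing..=end` with flat segments.
pub fn build_flat_segments<F>(first_missing: u64, end: u64, get_parents: F) -> Vec<FlatSegment>
where
    F: Fn(u64) -> Vec<u64>,
{
    let mut segments: Vec<FlatSegment> = Vec::new();
    for id in first_missing..=end {
        let parents = get_parents(id);
        if let Some(last) = segments.last_mut() {
            // Extend the current segment while the chain stays linear.
            if parents.len() == 1 && parents[0] + 1 == id {
                last.high = id;
                continue;
            }
        }
        // Roots, branch children, and merges each start a new segment.
        segments.push(FlatSegment { low: id, high: id, parents });
    }
    segments
}
```

For the example graph this produces three segments: `0..=2`, `3..=3` (parent 1), and `4..=4` (parents 2 and 3).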
Reviewed By: sfilipco
Differential Revision: D15055351
fbshipit-source-id: 21a503d4c3894583a314c6dfd4c7b87fafb95d95
Summary:
segment::Dag wants a `get_parents` function that speaks Ids instead of slices,
as segment::Dag works entirely on Ids.
Provide a function to translate `get_parents` on byte slices to Ids.
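The translation can be sketched as a closure adapter. This version uses two plain `HashMap`s as a stand-in for the real id map, and returns `None` when an id or parent is unknown; the names and the error handling are assumptions.

```rust
use std::collections::HashMap;

/// Example name-based graph: C has parents A and B.
pub fn example_parents(name: &[u8]) -> Vec<Vec<u8>> {
    match name {
        b"C" => vec![b"A".to_vec(), b"B".to_vec()],
        _ => Vec::new(),
    }
}

/// Example maps assigning A=0, B=1, C=2.
pub fn example_maps() -> (HashMap<Vec<u8>, u64>, HashMap<u64, Vec<u8>>) {
    let mut name_to_id = HashMap::new();
    let mut id_to_name = HashMap::new();
    for (id, name) in [b"A", b"B", b"C"].iter().enumerate() {
        name_to_id.insert(name.to_vec(), id as u64);
        id_to_name.insert(id as u64, name.to_vec());
    }
    (name_to_id, id_to_name)
}

/// Wrap a slice-based `get_parents` into an id-based one.
pub fn translate_get_parents<'a, F>(
    get_parents_by_name: F,
    name_to_id: &'a HashMap<Vec<u8>, u64>,
    id_to_name: &'a HashMap<u64, Vec<u8>>,
) -> impl Fn(u64) -> Option<Vec<u64>> + 'a
where
    F: Fn(&[u8]) -> Vec<Vec<u8>> + 'a,
{
    move |id: u64| -> Option<Vec<u64>> {
        let name = id_to_name.get(&id)?;
        // Collecting into `Option<Vec<_>>` yields `None` if any parent is unmapped.
        get_parents_by_name(name.as_slice())
            .iter()
            .map(|parent| name_to_id.get(parent.as_slice()).copied())
            .collect()
    }
}
```

The returned closure borrows both maps, so it can be handed to id-based code without copying them.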
Reviewed By: sfilipco
Differential Revision: D15055350
fbshipit-source-id: 795367cf809f068c0cad2515af02c93e14960236
Summary:
Assigning IDs affects the performance of segments. Therefore
the logic is abstracted in a way that the callsite only needs to provide a
`get_parents(slice) -> [slice]` function, and a slice to begin with. This
is intended to make it reusable for multiple cases:
- drawdag get_parents, for tests
- revlog get_parents
- Mononoke get_parents
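To make the shape of that interface concrete, here is a rough sketch of one possible assignment order: a depth-first walk that gives every parent an id before its child and visits the first parent first. The actual assignment strategy is not described here, so the traversal order and all names are assumptions.

```rust
use std::collections::HashMap;

/// Example name-based graph: A <- B <- C, B <- D, and E merges C and D.
pub fn example_parents(name: &[u8]) -> Vec<Vec<u8>> {
    match name {
        b"B" => vec![b"A".to_vec()],
        b"C" | b"D" => vec![b"B".to_vec()],
        b"E" => vec![b"C".to_vec(), b"D".to_vec()],
        _ => Vec::new(),
    }
}

/// Assign ids to `head` and all of its ancestors, parents first.
pub fn assign_ids<F>(head: &[u8], get_parents: F) -> HashMap<Vec<u8>, u64>
where
    F: Fn(&[u8]) -> Vec<Vec<u8>>,
{
    let mut ids: HashMap<Vec<u8>, u64> = HashMap::new();
    // Explicit stack instead of recursion; the flag records whether the
    // parents of the entry have already been scheduled.
    let mut stack: Vec<(Vec<u8>, bool)> = vec![(head.to_vec(), false)];
    while let Some((name, parents_scheduled)) = stack.pop() {
        if ids.contains_key(&name) {
            continue;
        }
        if parents_scheduled {
            let next_id = ids.len() as u64;
            ids.insert(name, next_id);
        } else {
            let parents = get_parents(name.as_slice());
            stack.push((name, true));
            // Push in reverse so that the first parent is handled first.
            for parent in parents.into_iter().rev() {
                if !ids.contains_key(&parent) {
                    stack.push((parent, false));
                }
            }
        }
    }
    ids
}
```

For the example graph, the first-parent chain A, B, C receives ids 0, 1, 2, then D gets 3 and the merge E gets 4, which keeps linear runs contiguous.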
Reviewed By: sfilipco
Differential Revision: D15055349
fbshipit-source-id: d6475737eb87f5ab7d7bd8123a8f4ae2b6d108e8
Summary:
This library parses an ASCII DAG. It is similar to mercurial/drawdag.py, which
was added by me in [1].
There are some (intentional) differences from the Python drawdag:
- Stricter. Confusing DAG characters like `+` or crossing lines are forbidden.
- Do not special handle `o` as a name.
- Do not try to be compatible with `hg log -G` output.
- Do not support special comments (yet).
- Support both left to right and bottom to top directions.
This library tries to be abstract. i.e. it does not have actual logic about
how to make a commit. Its intended users are Mononoke and scmdag, which have
different ways to make commits.
Since this library is intended to be used only for tests, I didn't spend
much effort optimizing its performance.
[1]: https://www.mercurial-scm.org/repo/hg/rev/a31634336471
Reviewed By: kulshrax
Differential Revision: D15039768
fbshipit-source-id: 4c33d44759ecf59aadc3d443a84db07d702dc69b
Summary:
The segment::Dag structure stores all levels of segments. The "segment" concept
is introduced by D14937221.
This diff adds empty structures and the serialization format.
Reviewed By: sfilipco
Differential Revision: D15019662
fbshipit-source-id: 8136acd45dc8526391e94c5ae98b609d4f8b392a
Summary:
This diff adds a new progress reporting framework to the Eden API crate and uses it to power progress bars for HTTP file downloads in Mercurial.
The new `ProgressManager` type is designed to aggregate progress values across multiple concurrent HTTP transfers. The API is currently designed to integrate well with libcurl's progress callback API, allowing all of the curl handles within a curl multi session to concurrently report their progress.
This progress can then be reported (in aggregate) to a user-provided callback. In most cases, this callback will be a Rust wrapper around a callback provided by the Python code. The `EdenAPI` trait and FFI bindings have been updated accordingly to allow optionally passing in a callback for long-running operations.
Lastly, in `remotefilelog`'s Python code, the callback is specified as a Python closure that simply updates the progress bar.
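The aggregation idea can be sketched without curl or Python. In this simplified, single-threaded sketch each transfer owns a slot, and every update re-sums all slots and reports the aggregate to the callback. The field names, the `register`/`update` methods, and the `Progress` layout are assumptions rather than the crate's actual API.

```rust
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Progress {
    pub downloaded: u64,
    pub total: u64,
}

/// Aggregates per-transfer progress and reports the sum to a callback.
pub struct ProgressManager<F: FnMut(Progress)> {
    slots: Vec<Progress>,
    callback: F,
}

impl<F: FnMut(Progress)> ProgressManager<F> {
    pub fn new(callback: F) -> Self {
        ProgressManager { slots: Vec::new(), callback }
    }

    /// Reserve a slot for one transfer and return its index.
    pub fn register(&mut self) -> usize {
        self.slots.push(Progress::default());
        self.slots.len() - 1
    }

    /// Record the latest values for one transfer, then report the aggregate.
    /// Updates for unknown slots are ignored.
    pub fn update(&mut self, slot: usize, progress: Progress) {
        if let Some(entry) = self.slots.get_mut(slot) {
            *entry = progress;
            let aggregate = self.aggregate();
            (self.callback)(aggregate);
        }
    }

    /// Sum of all slots.
    pub fn aggregate(&self) -> Progress {
        self.slots.iter().fold(Progress::default(), |acc, p| Progress {
            downloaded: acc.downloaded + p.downloaded,
            total: acc.total + p.total,
        })
    }
}
```

Each slot stores the latest absolute values rather than increments, which suits progress callbacks that report running totals per transfer.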
Reviewed By: quark-zju
Differential Revision: D15179983
fbshipit-source-id: ee677b71beff730f91aafe0364124f7ea0671387
Summary: Per title, `hg debughttp` now prints out the hostname that the API server reports rather than the hostname in the URL we used to connect to it. The reason for this is that if the API server is behind a VIP, we get the actual hostname rather than just the VIP URL.
Differential Revision: D15170618
fbshipit-source-id: 9af5480f9987d8ea9c914baf3b62a00ad88d1b32
Summary:
The former is no longer maintained and emits warnings with recent Rust
versions.
Reviewed By: singhsrb
Differential Revision: D15109706
fbshipit-source-id: 94479cdedf42c4dd99e35fa8e337d2fc73f74eb5
Summary: Due to the structure of this loop, we were unnecessarily blocking and polling when all curl transfers were already complete. To fix this, move the loop condition check to the middle of the loop.
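The restructured loop can be illustrated with a small stand-in for the curl multi handle. The `Transfers` trait and the mock are hypothetical; only the placement of the exit check reflects the change described above.

```rust
/// Hypothetical stand-in for a multi-transfer handle.
pub trait Transfers {
    /// Drive all transfers; returns how many are still in flight.
    fn perform(&mut self) -> usize;
    /// Block until there is activity or a timeout.
    fn wait(&mut self);
}

/// Drive transfers to completion without a trailing, unnecessary wait.
pub fn drive<T: Transfers>(transfers: &mut T) {
    loop {
        let in_flight = transfers.perform();
        // Checking here, between `perform` and `wait`, means we never
        // block once everything has already completed.
        if in_flight == 0 {
            break;
        }
        transfers.wait();
    }
}

/// Mock that completes one transfer per `perform` call and counts waits.
pub struct MockTransfers {
    pub remaining: usize,
    pub waits: usize,
}

impl Transfers for MockTransfers {
    fn perform(&mut self) -> usize {
        self.remaining = self.remaining.saturating_sub(1);
        self.remaining
    }

    fn wait(&mut self) {
        self.waits += 1;
    }
}
```

With the check at the top or bottom of the loop instead, the mock would record one extra wait after the final transfer finished.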
Reviewed By: quark-zju
Differential Revision: D15124823
fbshipit-source-id: 92b7eee83cbfd62d590c21893f3235e1ca04fcec
Summary: This is the result from running `python ./contrib/fix-code.py $(hg files .)`.
Reviewed By: HarveyHunt
Differential Revision: D15121815
fbshipit-source-id: 994a44e155806252c57c0a3c9c448101d21c6b57
Summary: The former triggers warnings when compiling, and recommends using rand_chacha instead.
Reviewed By: singhsrb
Differential Revision: D15106307
fbshipit-source-id: 58ad62cc96cc8878086d79d83bcdc2075b416375
Summary:
The error-chain crate is unmaintained and triggers warnings when compiling
with new versions of Rust. Let's use the failure crate instead to be consistent
with the other crates.
Reviewed By: singhsrb
Differential Revision: D15106306
fbshipit-source-id: 8edcf9f9aaf4b6e2d5f214b26fed3e72d4f3acd1
Summary:
The latter is deprecated in pest, causing the compiler to warn about it. This
also removes a handful of clone operations.
Reviewed By: quark-zju
Differential Revision: D15091596
fbshipit-source-id: 9bd902d9efb9aef3aba55e11b4472653a895bfcd
Summary:
Instead of manually dropping some of the datapack/historypack fields, we can
drop the entire object. This allows implementing the Drop trait more easily.
But, this prevents the code from later using some of the object fields. We can
use replace to move them in a zero-copy fashion.
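A minimal sketch of the pattern, assuming "replace" refers to `std::mem::replace`. The `MutablePack` type and its fields are hypothetical stand-ins for the datapack/historypack structures.

```rust
use std::mem;
use std::path::PathBuf;

/// Hypothetical pack type with a `Drop` implementation.
pub struct MutablePack {
    path: PathBuf,
    buffer: Vec<u8>,
}

impl Drop for MutablePack {
    fn drop(&mut self) {
        // Placeholder cleanup; a real pack might remove temporary files here.
        self.buffer.clear();
    }
}

impl MutablePack {
    pub fn new(path: impl Into<PathBuf>, buffer: Vec<u8>) -> Self {
        MutablePack { path: path.into(), buffer }
    }

    /// Consume the pack and hand its fields to the caller.
    pub fn close(mut self) -> (PathBuf, Vec<u8>) {
        // Destructuring `self` does not compile (E0509) because the type
        // implements `Drop`, so each field is swapped out for an empty
        // value instead. This moves the heap data without copying it.
        let path = mem::replace(&mut self.path, PathBuf::new());
        let buffer = mem::replace(&mut self.buffer, Vec::new());
        (path, buffer)
        // `self` is dropped here; `Drop` only sees the emptied fields.
    }
}
```

The replacement values (`PathBuf::new()` and `Vec::new()`) do not allocate, so the extra cost is limited to a few pointer-sized moves.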
Reviewed By: DurhamG
Differential Revision: D15076017
fbshipit-source-id: 4831dfcc2005c957862d32eeda02f62796be3afb
Summary:
Use a dedicated Span type so we can enforce reverse ordering and `start <= end`
directly on the Span structure.
The constructor of `SpanSet` becomes more expensive because it recreates the
`Vec` and sorts it. In practice, this is hopefully fine. Internal logic like
union will not use that constructor.
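The invariants can be sketched as follows. The sketch assumes that "reverse ordering" means spans are kept sorted from highest to lowest and that overlapping or adjacent spans are merged; the field and method names are hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    start: u64,
    end: u64,
}

impl Span {
    /// Enforce `start <= end` at construction time.
    pub fn new(start: u64, end: u64) -> Span {
        assert!(start <= end, "span requires start <= end");
        Span { start, end }
    }
}

/// Spans sorted from highest to lowest, non-overlapping and non-adjacent.
#[derive(Debug, PartialEq, Eq)]
pub struct SpanSet {
    spans: Vec<Span>,
}

impl SpanSet {
    /// Public constructor: rebuilds the `Vec`, sorts it, and merges spans.
    pub fn from_spans(spans: impl IntoIterator<Item = Span>) -> SpanSet {
        let mut sorted: Vec<Span> = spans.into_iter().collect();
        sorted.sort_by(|a, b| b.end.cmp(&a.end));
        let mut merged: Vec<Span> = Vec::with_capacity(sorted.len());
        for span in sorted {
            if let Some(last) = merged.last_mut() {
                // `last` has the larger end; merge if they touch or overlap.
                if span.end.saturating_add(1) >= last.start {
                    last.start = last.start.min(span.start);
                    continue;
                }
            }
            merged.push(span);
        }
        SpanSet { spans: merged }
    }

    pub fn spans(&self) -> &[Span] {
        &self.spans
    }

    pub fn contains(&self, id: u64) -> bool {
        self.spans.iter().any(|s| s.start <= id && id <= s.end)
    }
}
```

Set operations that already produce sorted, merged output can build the `Vec` directly and skip this constructor, which is why its extra cost matters less.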
Some comments and tweaks have been made to make the code easier to read.
There are some performance changes, though:
Before:
    intersection    5.030 ms
    union           5.920 ms
    difference      4.804 ms
After:
    intersection    6.036 ms
    union           5.426 ms
    difference      4.710 ms
`intersection` becomes slower, while `union` and `difference` become a bit faster.
Hopefully the regression is within the acceptable range.
Reviewed By: sfilipco
Differential Revision: D15023651
fbshipit-source-id: ea7845d5d20faf204cfb85c66fc3bd6e25c9fc0c
Summary: This would provide some data about changes around SpanSet.
Reviewed By: sfilipco
Differential Revision: D15023652
fbshipit-source-id: 4cff7d1876fe20cd876f26926f31e018b6c88fd9
Summary:
Complete the IdMap interface so it's usable.
There are 2 possible use patterns:
- On-disk IdMap + In-memory additions. Practically, the server provides an
  on-disk map, and the client might assign missing commits on demand. The
  client still needs to update the IdMap during pull.
- Everything is on-disk. There are no in-memory additions. This is more complex
  because the local commits might become part of the server commits in the
  future, and it might require Ids for those commits to be re-assigned.
I haven't decided which way to go exactly. So let's keep the interface flexible
for both.
That said, I do want to reduce the chance of causing filesystem race conditions
for filesystem writes. In this case, both reads and writes should hold a lock.
So a dedicated type is used to encourage the pattern of:
- get the dedicated type (and hold the filesystem lock)
- read, write, sync
Write related methods are not moved to the dedicated type, to cover the
in-memory addition use-case.
Reviewed By: sfilipco
Differential Revision: D15008517
fbshipit-source-id: 5d117ed7f2947aed6ed524a3b5199c071908c4ae
Summary:
There will be lots of algorithms or structures that operate on integers as
commit identities. The source of truth for commit identities is the commit
hashes. Add a map to be able to translate between them.
The map is designed to be sparse, so it can be used as a cache if the map
is moved to server-side.
The map does not take `[u8; 20]` as its value type, with the intention to
support other hash functions. For example, Bonsai Blake2 hashes have 32 bytes.
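A small in-memory sketch of such a map follows. It is sparse (ids need not be contiguous) and stores hashes as variable-length byte vectors, so 20-byte and 32-byte hashes can coexist. The real map is backed by the filesystem; the struct layout, method names, and error type here are assumptions.

```rust
use std::collections::{BTreeMap, HashMap};

/// Bidirectional, sparse map between integer ids and commit hashes.
#[derive(Default)]
pub struct IdMap {
    id_to_hash: BTreeMap<u64, Vec<u8>>,
    hash_to_id: HashMap<Vec<u8>, u64>,
}

impl IdMap {
    /// Insert a pair, rejecting anything that conflicts with existing entries.
    pub fn insert(&mut self, id: u64, hash: &[u8]) -> Result<(), String> {
        if let Some(existing) = self.id_to_hash.get(&id) {
            if existing.as_slice() != hash {
                return Err(format!("id {} is already taken", id));
            }
        }
        if let Some(&existing) = self.hash_to_id.get(hash) {
            if existing != id {
                return Err(format!("hash already has id {}", existing));
            }
        }
        self.id_to_hash.insert(id, hash.to_vec());
        self.hash_to_id.insert(hash.to_vec(), id);
        Ok(())
    }

    pub fn find_hash(&self, id: u64) -> Option<&[u8]> {
        self.id_to_hash.get(&id).map(|hash| hash.as_slice())
    }

    pub fn find_id(&self, hash: &[u8]) -> Option<u64> {
        self.hash_to_id.get(hash).copied()
    }
}
```

Because lookups return `Option`, a missing entry is an ordinary outcome, which is what allows the map to act as a partial cache.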
Since integer ids live in a global namespace and can conflict if there
are multiple writers, the interface is designed so that write (to filesystem)
operations require an explicit critical section.
Reviewed By: sfilipco
Differential Revision: D15008518
fbshipit-source-id: 9f53aae551c54e1b47b5f837642ea00fca8579c3
Summary:
The spanset is a set of integer spans. It will be used by some DAG related
operations. It'll be used as a subset of mercurial/smartset.py.
Note: smartset.py also has a Python `spanset` structure. That is different
from this Rust spanset in these ways:
- The Rust set does not preserve ordering.
- The Rust set can have multiple spans, instead of just one.
- The Rust set is less abstract (for now). Its set operations (union, etc.)
  only work on the same type.
This diff adds some initial functions for it.
Reviewed By: sfilipco
Differential Revision: D15004985
fbshipit-source-id: c2e5e2a80e2e4681c2f443e0d8a83dc97f7be371
Summary: The scmdag library is going to have things related to the commit graph.
Reviewed By: sfilipco
Differential Revision: D15004984
fbshipit-source-id: f274cceeabae4a57985763216572f7cd055f8e07
Summary: Release the GIL during data fetching to allow for progress bars to update properly. The data fetching code is pure Rust and does not interact with the Python interpreter at all, so releasing the GIL here is safe.
Differential Revision: D15051852
fbshipit-source-id: 144da953720951f9a30aadfc2b7fc8c8bc6b14aa
Summary: Reading a comment is easier than trying to figure out the on-disk format.
Reviewed By: kulshrax
Differential Revision: D15056859
fbshipit-source-id: 097ed8bcaa51369aba4bcc9ed1cc95ebd6a67a66
Summary:
Compressing/Decompressing data can be expensive, so avoid doing it when not
needed. I thought about using a RefCell but decided to just use a mutable
reference, as an Entry will always be private to indexedlogdatastore.rs.
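The lazy approach can be sketched as follows. A toy run-length decoder stands in for the real decompression, and the `Entry` fields and method names are hypothetical. The point is that `content` takes `&mut self`, decodes at most once, and caches the result without needing a `RefCell`.

```rust
/// Toy decoder standing in for real decompression:
/// the input is a sequence of (count, byte) pairs.
fn rle_decode(data: &[u8]) -> Vec<u8> {
    data.chunks_exact(2)
        .flat_map(|pair| std::iter::repeat(pair[1]).take(pair[0] as usize))
        .collect()
}

pub struct Entry {
    compressed: Vec<u8>,
    /// Decoded content, filled in on first access.
    content: Option<Vec<u8>>,
    /// Counts decoder runs; only here to make the caching observable.
    decode_calls: usize,
}

impl Entry {
    pub fn new(compressed: Vec<u8>) -> Self {
        Entry { compressed, content: None, decode_calls: 0 }
    }

    /// Decode on first use and return the cached bytes afterwards.
    pub fn content(&mut self) -> &[u8] {
        if self.content.is_none() {
            self.decode_calls += 1;
            self.content = Some(rle_decode(&self.compressed));
        }
        self.content.as_deref().unwrap_or(&[])
    }

    pub fn decode_calls(&self) -> usize {
        self.decode_calls
    }
}
```

An entry that is only copied or inspected by key never pays for decoding at all.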
Reviewed By: kulshrax
Differential Revision: D15056862
fbshipit-source-id: ac0b811f2df563be86e3ade9abe89476db5d13cc
Summary: This will allow decompression to be done on the fly as opposed to always.
Reviewed By: kulshrax
Differential Revision: D15056860
fbshipit-source-id: 60635c431579fc924a61d08b35688222ec4930bb
Summary:
Delta chains are only created during repack, as every download operation
fetches the full content of the file. Even if we wanted to support them,
interrupted chains add undesirable complexity, as they can lead to chain loops
if we're not careful. Let's just not support delta chains for now to avoid this.
Reviewed By: kulshrax
Differential Revision: D15056861
fbshipit-source-id: 4b0474ce134e946952a70f363190faf50850abe0
Summary: Now that IndexedLog is also in this crate, its name is no longer relevant.
Reviewed By: kulshrax
Differential Revision: D15056502
fbshipit-source-id: cb00c8322ac4ff7da97c8faaec2959e5f68ca4ca
Summary: Add a new config option to toggle file validation.
Differential Revision: D15034687
fbshipit-source-id: 3783ea1dacad9d1e494a5de1388f703db0ed1129
Summary:
I want to give Store a more specific name so that it doesn't get
confused with other Store abstractions that we will add in the
future.
Reviewed By: singhsrb
Differential Revision: D15007383
fbshipit-source-id: 499bcda4aecd5389e3bc1eba5206ba72a69c4c3d
Summary:
`Log::lookup_range` exposes the range query feature provided by `Index`.
The iterator is also made double-ended along the way.
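As an analogy only, the standard library's `BTreeMap::range` offers the same combination of range queries and double-ended iteration. The sketch below does not use the indexedlog API; `lookup_range` here is a hypothetical free function over a `BTreeMap`, and the half-open `[start, end)` bounds are an assumption.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Example index with keys "a".."d" mapped to 0..3.
pub fn example_index() -> BTreeMap<Vec<u8>, u64> {
    [b"a", b"b", b"c", b"d"]
        .iter()
        .enumerate()
        .map(|(i, key)| (key.to_vec(), i as u64))
        .collect()
}

/// Iterate over entries with `start <= key < end`, from either direction.
///
/// Note: `BTreeMap::range` panics if `start > end`.
pub fn lookup_range<'a>(
    index: &'a BTreeMap<Vec<u8>, u64>,
    start: &[u8],
    end: &[u8],
) -> impl DoubleEndedIterator<Item = (&'a [u8], u64)> + 'a {
    index
        .range::<[u8], _>((Bound::Included(start), Bound::Excluded(end)))
        .map(|(key, value)| (key.as_slice(), *value))
}
```

Because the result is a `DoubleEndedIterator`, callers can use `.rev()` or `.next_back()` to read the newest or largest keys first without collecting the whole range.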
Reviewed By: sfilipco
Differential Revision: D14895477
fbshipit-source-id: 6aef0973e009bf8fc6f3b5e5a8f6c54e57c81360
Summary:
The RangeIter is actually faster. The main reason is that it avoids recursion.
RangeIter does require double Vec, which seems like extra overhead. Practically
it does not seem to matter much.
The RangeIter code is also better written than PrefixIter. So let's delete
PrefixIter, and switch prefix lookups to use RangeIter.
Before:
    index prefix scan (2B)         89.788 ms
    index prefix scan (1B)         72.337 ms
    index prefix scan (2B, disk)  102.098 ms
    index prefix scan (1B, disk)   90.445 ms
After:
    index prefix scan (2B)         76.335 ms
    index prefix scan (1B)         54.517 ms
    index prefix scan (2B, disk)   91.798 ms
    index prefix scan (1B, disk)   67.143 ms
Reviewed By: sfilipco
Differential Revision: D14895478
fbshipit-source-id: 79a01774fb640c78fc5733db82f86f0f9403c960