Summary:
`Node` is an overloaded name, so in our conversations we constantly have to
clarify which Node we are talking about: the type in Mercurial, or the generic
term from graphs/trees? It is just an identifier, so we rename it accordingly.
Note that this doesn't remove the Node term yet. I will take care of the
long tail of uses incrementally.
Things that are not renamed:
* linknode
* nodeinfo
* WireHistoryEntry
* cdatapack
Reviewed By: quark-zju
Differential Revision: D18010149
fbshipit-source-id: 83b3911e231e4544848391cd3deb6e44ec2b809d
Summary:
`getancestors` is handled by the metadatastore via the `getnodeinfo` method,
so the lower-level stores don't need to implement it.
Reviewed By: quark-zju
Differential Revision: D17946853
fbshipit-source-id: b516fdad15c96882f1898eb2e94b6ddff353d2bf
Summary:
With loosefiles gone, we no longer need to repack them, so let's get rid of
the code.
Reviewed By: quark-zju
Differential Revision: D17923874
fbshipit-source-id: 9bc3390d448df4576e4567447a00446d1d1ff717
Summary:
Remove command name prefix matching to make things much easier.
This breaks command completion.
Reviewed By: sfilipco
Differential Revision: D17644675
fbshipit-source-id: c8a866bf0649c4644c8e5630acdceb459a22deb4
Summary: This exposes the repair API to the Python world.
Reviewed By: xavierd
Differential Revision: D17755604
fbshipit-source-id: fb5089a1f0648b18d4a338c3c73e939d5ce37bed
Summary:
The function now just wraps a PyErr into a PythonError, so let's inline this
bit into the code.
Reviewed By: quark-zju
Differential Revision: D17729379
fbshipit-source-id: ad569cee03497fab710f760d0fa09e0ea61fe208
Summary:
Instead of using KeyError to indicate that data isn't found, let's use an
Option. The Option type better encodes that data is missing, without requiring
a potentially error-prone downcast. This may also let us set RUST_BACKTRACE=1
everywhere, as we no longer expect errors to happen often. Previously,
Mercurial would slow to a crawl due to the many KeyErrors being thrown around.
I initially wanted to keep the change small to help reviews, but that didn't
really work out, as the dependencies on the `DataStore`/`HistoryStore` traits
are all over the place...
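A minimal sketch of the idea, not the actual `DataStore`/`HistoryStore` traits: the `ToyStore`, `StoreError`, and `get_from_any` names below are hypothetical. A missing key is reported as `Ok(None)`, and `Err` is kept for genuine failures, so a lookup that falls through several stores never constructs an error.

```rust
use std::collections::HashMap;

/// Hypothetical error type standing in for real I/O or corruption failures.
#[derive(Debug, PartialEq)]
pub struct StoreError(pub String);

/// Hypothetical in-memory store; the real traits are not reproduced here.
pub struct ToyStore {
    blobs: HashMap<String, Vec<u8>>,
}

impl ToyStore {
    pub fn new() -> Self {
        ToyStore { blobs: HashMap::new() }
    }

    pub fn insert(&mut self, key: &str, data: &[u8]) {
        self.blobs.insert(key.to_string(), data.to_vec());
    }

    /// `Ok(None)` means "not found"; `Err` is reserved for genuine failures.
    pub fn get(&self, key: &str) -> Result<Option<Vec<u8>>, StoreError> {
        if key.is_empty() {
            return Err(StoreError("empty key".to_string()));
        }
        Ok(self.blobs.get(key).cloned())
    }
}

/// Try each store in order; a miss falls through without creating an error.
pub fn get_from_any(stores: &[ToyStore], key: &str) -> Result<Option<Vec<u8>>, StoreError> {
    for store in stores {
        if let Some(data) = store.get(key)? {
            return Ok(Some(data));
        }
    }
    Ok(None)
}
```

With this shape, callers distinguish "absent" from "broken" by pattern matching rather than by downcasting an error.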
Reviewed By: quark-zju
Differential Revision: D17728486
fbshipit-source-id: de89c4fc441fd12ff37cc248e2230e4a1403ce44
Summary:
Add a client-driven tree prefetching implementation to the Rust manifest code. Unlike the existing prefetch implementation in Python, this one does all computation of which nodes to fetch on the client side using the BFS logic from BfsDiff. The trees are then bulk fetched layer-by-layer using EdenAPI.
This initial version is fairly naive, and omits some obvious optimizations (such as performing fetches of multiple trees concurrently), but is sufficient to demonstrate HTTP tree prefetching in action.
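The layer-by-layer fetching described above could be sketched roughly as follows. This is an illustration only: `prefetch_layers`, `bulk_fetch`, and `toy_server` are hypothetical names, the closure stands in for a batched network request, and the real BfsDiff and EdenAPI interfaces are not reproduced.

```rust
use std::collections::HashMap;

/// Walk a tree breadth-first, issuing one bulk fetch per layer.
/// `bulk_fetch` is a hypothetical stand-in for a batched network request:
/// given the tree ids of one layer, it returns the child tree ids of each.
pub fn prefetch_layers<F>(root: &str, mut bulk_fetch: F) -> Vec<Vec<String>>
where
    F: FnMut(&[String]) -> Vec<Vec<String>>,
{
    let mut layers = Vec::new();
    let mut current = vec![root.to_string()];
    while !current.is_empty() {
        // One request covers every tree at this depth.
        let children = bulk_fetch(&current);
        let next: Vec<String> = children.into_iter().flatten().collect();
        layers.push(current);
        current = next;
    }
    layers
}

/// Toy "server": maps a tree id to the ids of its subtrees.
pub fn toy_server() -> HashMap<String, Vec<String>> {
    let mut m = HashMap::new();
    m.insert("root".to_string(), vec!["a".to_string(), "b".to_string()]);
    m.insert("a".to_string(), vec!["a/x".to_string()]);
    m
}
```

The number of round trips equals the depth of the tree rather than the number of directories. Like the naive version described here, the sketch fetches layers sequentially and does no deduplication.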
Reviewed By: xavierd
Differential Revision: D17379178
fbshipit-source-id: f17fe99834ad4fec07b4a4ab196928cc4fe91142
Summary:
Change the `Files` iterator in the Rust manifest code to traverse the tree in BFS order, allowing for layer-by-layer prefetching similar to `Diff`. This can substantially speed up walks over the tree when the cache is cold.
As a side-effect, this changes the order in which paths are reported during a manifest walk. (In particular, they are now reported in breadth-first order rather than depth-first order.) This may break things that rely on the existing ordering; as such, we may need to add a sort somewhere if this turns out to be a problem.
Reviewed By: xavierd
Differential Revision: D17645389
fbshipit-source-id: 624e426094a93e206bde4523ea8bd034fe5aeb90
Summary: Remove the DFS diff implementation and replace it with the BFS implementation.
Reviewed By: xavierd
Differential Revision: D17618818
fbshipit-source-id: d486642caae924f866a200d3c82fa5a4cb7d5286
Summary: Add a Python binding for the working copy Rust library. The Walker is initialized with the root of the repo and the Python matcher, and iteratively returns the matching files in the working copy. Walker and PythonMatcher were modified so that Walker can have the Send trait.
Reviewed By: xavierd
Differential Revision: D17403235
fbshipit-source-id: b8b84928aac7c79c4388a8ba8aa5475aac0c5219
Summary:
Remove py from the Python matcher so that the matcher has the Send trait
(because of Rc, holding py prevented that). The Python matcher needs to be
stored by the Python walker, and pyclasses can only store data that is
Send + 'static. Python pathmatcher methods should only be called by Python
methods.
Reviewed By: xavierd
Differential Revision: D17511705
fbshipit-source-id: 00a4938fb00c30244ae04cb38362e8875c72fa47
Summary:
Getting "parents" in revlogindex happens in a very hot loop, so avoiding
allocation matters. With this patch, it's roughly 5x faster, and matches
the C code doing a whole changelog scan.
Before:
In [3]: %time cl.index2.headsancestors([1,len(cl)-1])
CPU times: user 330 ms, sys: 68 µs, total: 330 ms
Wall time: 330 ms
Out[3]: [5584666]
After:
In [3]: %time cl.index2.headsancestors([1,len(cl)-1])
CPU times: user 52.9 ms, sys: 0 ns, total: 52.9 ms
Wall time: 53 ms
Out[3]: [5584665]
C code doing whole changelog scan:
In [5]: %time cl.index.clearcaches(); cl.index.headrevs(); 1
CPU times: user 54.2 ms, sys: 187 µs, total: 54.4 ms
Wall time: 54.4 ms
`smallvec` was not used, as it has extra overhead tracking whether it's
stack or heap allocated, which makes it 2x slower than this diff.
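One possible way to return parents without allocating is a small fixed-size value type, sketched below. The `Parents` type is hypothetical and is not taken from this diff; it assumes a revision has at most two parents and that a negative number means "no parent".

```rust
/// Parents of a revision, stored inline so no heap allocation is needed.
/// Assumption: at most two parents, and a negative value means "no parent".
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Parents {
    revs: [i32; 2],
    len: u8,
}

impl Parents {
    pub fn new(p1: i32, p2: i32) -> Self {
        let mut revs = [-1; 2];
        let mut len = 0;
        for &p in &[p1, p2] {
            if p >= 0 {
                revs[len] = p;
                len += 1;
            }
        }
        Parents { revs, len: len as u8 }
    }

    /// Borrow the valid parents as a slice; the length is always known,
    /// so there is no stack-versus-heap check on access.
    pub fn as_slice(&self) -> &[i32] {
        &self.revs[..self.len as usize]
    }
}
```

Because the value is `Copy` and lives on the stack, a hot loop can call `Parents::new` for every revision without touching the allocator.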
Reviewed By: sfilipco
Differential Revision: D17581248
fbshipit-source-id: cf6e36e0000759f41410f1e3a1d252920711fb79
Summary: This will be used to store "invisible heads".
Reviewed By: sfilipco
Differential Revision: D17264837
fbshipit-source-id: a450b5c10cc961d43ec8eb852cb2fb22849a8c00
Summary:
The SpanSet can include a large amount of revs. Iterating through it by putting
everything in a PyList is suboptimal. Therefore add a dedicated native iterator
for it. This speeds up iteration greatly, which can be verified via debugshell:
Before:
In [1]: s=m.smartset.spansset(b.dag.spans(xrange(5000000)))
In [2]: %time s.slice(0,10)
CPU times: user 135 ms, sys: 42.9 ms, total: 178 ms
Wall time: 180 ms
After:
In [1]: s=m.smartset.spansset(b.dag.spans(xrange(5000000)))
In [2]: %time s.slice(0,10)
CPU times: user 49 µs, sys: 6 µs, total: 55 µs
Wall time: 58.2 µs
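The gain comes from iterating lazily instead of materializing every rev up front. A simplified sketch of that idea, using a hypothetical `Spans` type (inclusive `(low, high)` pairs, highest span first) rather than the real SpanSet:

```rust
/// A set of revisions stored as inclusive (low, high) spans, highest span first.
/// Hypothetical simplification; the real SpanSet is not reproduced here.
pub struct Spans {
    spans: Vec<(u64, u64)>,
}

impl Spans {
    pub fn new(spans: Vec<(u64, u64)>) -> Self {
        Spans { spans }
    }

    /// Lazily yield revisions in descending order without building a list.
    pub fn iter(&self) -> impl Iterator<Item = u64> + '_ {
        self.spans
            .iter()
            .flat_map(|&(low, high)| (low..=high).rev())
    }
}
```

Taking the first few items only visits those items, so the cost of a small slice no longer depends on the total number of revs in the set.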
Reviewed By: sfilipco
Differential Revision: D17305350
fbshipit-source-id: 0db00aa57fb6bf2141ccea94b2536da78f103cef
Summary:
Global states (for example, the global blackbox instance, and potentially some
logging / tracing libraries) are separate in the Rust and Python worlds.
That is because the related code gets compiled separately:
  bindings.so (top-level)
   \_ blackbox
  hgmain (top-level)
   \_ blackbox (has a different global instance than the above blackbox)
To address it, make `bindings` a builtin module in `hgmain`.
The builtin module was renamed from `edenscmnative.bindings` to `bindings` so
it does not require importing anything else (for example, `edenscmnative`).
This unfortunately makes `hg` 100+ MB. Fortunately it can be compressed well
(gzip: 31MB).
Reviewed By: singhsrb
Differential Revision: D17429688
fbshipit-source-id: bf16910d7a260ca58db0d272fc95d8071d47bbc6
Summary:
The RemoteDataStore is expected to be implemented by the higher-level types;
we don't want the low-level ones to pretend to have a prefetch method.
Reviewed By: quark-zju
Differential Revision: D17437893
fbshipit-source-id: 52ec90a6edf9aa5dac852fb827275be7fd361080
Summary:
The revisionstore crate will soon need to build a `dyn EdenApi` object to fetch
data from the network. Since the edenapi crate depends on revisionstore, this
would create a recursive dependency between the two. Several approaches were
considered, including moving either the EdenApi or the MutableDeltaStore trait
outside of their respective crates. In the end, I decided to remove the
dependency altogether and let the caller decide what to do about the data.
Reviewed By: quark-zju
Differential Revision: D17437895
fbshipit-source-id: cc3ec830562c0616d40d7d5d36f69674934d87b9
Summary: These no longer need to be passed in with &mut.
Reviewed By: quark-zju
Differential Revision: D17379054
fbshipit-source-id: b1d0591013d92aaa3cc60cc3b23f42a1f175d1cb
Summary:
Per team meeting, we want to remove whole changelog scans that are incompatible
with the upcoming dag changes.
Heads calculation is one such "whole changelog scan".
The plan is to use visibility heads + remote names to answer `head()`. However,
remote names are not guaranteed to be heads. For example, `stable` might be
an ancestor of `master`. To get the right answer about `head()`, some
calculation like `heads(::(remotenames() + visible-heads()))` needs to be done.
Calculating `heads(ancestors(...))` in Python is quite slow. This diff provides
a native fast path for it. It still requires a partial changelog scan, but will be
compatible with the future dag-based commit graph.
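To make the `heads(ancestors(...))` computation concrete, here is a rough sketch. It is not the implementation in this diff: `heads_of_ancestors` is a hypothetical function, and it assumes revisions are numbered so that parents always have smaller numbers than their children. It scans only the range between the lowest and highest input rev, which is one way a partial scan can work.

```rust
/// Compute heads(ancestors(revs)).
/// Assumption: `parents[rev]` lists the parents of `rev`, and every parent has
/// a smaller revision number than its child.
pub fn heads_of_ancestors(parents: &[Vec<usize>], revs: &[usize]) -> Vec<usize> {
    let (min, max) = match (revs.iter().min(), revs.iter().max()) {
        (Some(&min), Some(&max)) => (min, max),
        _ => return Vec::new(),
    };
    let width = max - min + 1;
    // in_set[r - min]: r is in ancestors(revs).
    // has_child[r - min]: r has a child inside that ancestor set.
    let mut in_set = vec![false; width];
    let mut has_child = vec![false; width];
    for &r in revs {
        in_set[r - min] = true;
    }
    let mut heads = Vec::new();
    // Scan from the highest rev down; nothing below `min` can be a head.
    for rev in (min..=max).rev() {
        let i = rev - min;
        if !in_set[i] {
            continue;
        }
        if !has_child[i] {
            heads.push(rev);
        }
        for &p in &parents[rev] {
            if p >= min {
                in_set[p - min] = true;
                has_child[p - min] = true;
            }
        }
    }
    heads.reverse();
    heads
}
```

In the `stable`/`master` example, if `stable` is an ancestor of `master`, it gets marked as having a child in the set when the scan reaches it and is therefore dropped from the result.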
Reviewed By: sfilipco
Differential Revision: D17199841
fbshipit-source-id: 6ea4367b8877209899d56094f8d8ee1aff1ad6f3
Summary:
Add a function to do "head-based phases" calculation on the revlog, so we can
experiment with the breaking change: phases are no longer root-based, and
are probably defined by remotenames and visibility heads.
The segmented changelog structure will drop support for root-based phases for
performance.
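As an illustration of what a head-based definition might look like (an assumption, not the function added here): public revs are the ancestors of the public heads, and draft revs are the ancestors of the draft heads that are not public. The names `phases_from_heads` and `ancestors` are hypothetical, and parents are assumed to have smaller revision numbers than their children.

```rust
/// Mark every ancestor (inclusive) of `heads`, scanning from high to low revs.
/// Assumption: parents always have smaller revision numbers than children.
fn ancestors(parents: &[Vec<usize>], heads: &[usize]) -> Vec<bool> {
    let mut seen = vec![false; parents.len()];
    for &h in heads {
        seen[h] = true;
    }
    for rev in (0..parents.len()).rev() {
        if seen[rev] {
            for &p in &parents[rev] {
                seen[p] = true;
            }
        }
    }
    seen
}

/// Return (public revs, draft revs) derived purely from two sets of heads.
pub fn phases_from_heads(
    parents: &[Vec<usize>],
    public_heads: &[usize],
    draft_heads: &[usize],
) -> (Vec<usize>, Vec<usize>) {
    let public = ancestors(parents, public_heads);
    let reachable = ancestors(parents, draft_heads);
    let mut public_revs = Vec::new();
    let mut draft_revs = Vec::new();
    for rev in 0..parents.len() {
        if public[rev] {
            public_revs.push(rev);
        } else if reachable[rev] {
            draft_revs.push(rev);
        }
    }
    (public_revs, draft_revs)
}
```

No phase roots are stored anywhere in this sketch; the phase of a rev follows entirely from which heads can reach it.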
Reviewed By: sfilipco
Differential Revision: D17199844
fbshipit-source-id: 4a4dba183bb5f751b0cf454b9fc2b7e601e8c491
Summary:
This module is intended to have native paths for some operations that need to
scan the whole changelog. It allows us to experiment with some breaking changes,
namely head-based visibility without "filtered revs" and head-based phases on
the revlog format, before the more advanced structure takes over.
This diff adds a revlog index reader that can answer simple queries like
"length" and "parents".
Reviewed By: sfilipco
Differential Revision: D17199837
fbshipit-source-id: 2574f64c980419fa966200fd52fa5ddf873baae4
Summary:
Expose more methods in Rust to Python.
While we're here, change `__contains__` to take a signed int so that a
`-1 in set` test won't trigger an error.
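The signed-argument behavior could look like the sketch below on the Rust side. `RevSet` is a hypothetical stand-in, not the type exposed by the bindings: a negative input is treated as "not in the set" instead of failing a conversion to an unsigned type.

```rust
use std::collections::BTreeSet;

/// Hypothetical set of non-negative revision numbers.
pub struct RevSet {
    revs: BTreeSet<u64>,
}

impl RevSet {
    pub fn from_revs(revs: &[u64]) -> Self {
        RevSet { revs: revs.iter().copied().collect() }
    }

    /// Accept a signed value: negative inputs are simply never members.
    pub fn contains(&self, rev: i64) -> bool {
        if rev < 0 {
            false
        } else {
            self.revs.contains(&(rev as u64))
        }
    }
}
```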
Reviewed By: sfilipco
Differential Revision: D17244562
fbshipit-source-id: 0b8b9069bd0a35615066d1328933ca50b09b4a25
Summary:
Previously, `SyncableDag` and `Dag` could co-exist. Dropping SyncableDag involves
error handling and is not panic-free. If we wanted to make sure `Dag` has complete
high-level segments, that would have had to be implemented in `SyncableDag::drop`,
making it more sensitive to panics.
Change the API so that `SyncableDag` is independent from `Dag`. `Dag` then always
has complete segments, and changes to `SyncableDag` are invisible to `Dag`,
so `SyncableDag` cannot mess up existing `Dag` structures.
Reviewed By: sfilipco
Differential Revision: D17000969
fbshipit-source-id: 1ceed4ea335d3d64848b7430d48076846b90695d
Summary: This makes it possible to decode VLQ from a stream.
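A sketch of what stream decoding can look like, under stated assumptions: `read_vlq` is a hypothetical function, not the API added here, and the encoding is assumed to carry 7 payload bits per byte, least significant group first, with the high bit set on every byte except the last.

```rust
use std::io::{self, Read};

/// Read one unsigned VLQ integer from any `Read` stream.
/// Assumption: 7 payload bits per byte, least significant group first, and
/// the high bit marks "more bytes follow".
pub fn read_vlq<R: Read>(reader: &mut R) -> io::Result<u64> {
    let mut value: u64 = 0;
    let mut shift = 0u32;
    loop {
        let mut byte = [0u8; 1];
        // Fails with UnexpectedEof if the stream ends mid-integer.
        reader.read_exact(&mut byte)?;
        let bits = (byte[0] & 0x7f) as u64;
        // Reject encodings that do not fit in 64 bits.
        if shift >= 64 || (shift == 63 && bits > 1) {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "VLQ overflow"));
        }
        value |= bits << shift;
        if byte[0] & 0x80 == 0 {
            return Ok(value);
        }
        shift += 7;
    }
}
```

Because the function consumes only the bytes belonging to one integer, it can be called repeatedly on the same stream to decode consecutive values.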
Reviewed By: alexeyqu
Differential Revision: D17404066
fbshipit-source-id: 4a3b0e5333664c3cfc0f76bbdc7db80c25a3a49c
Summary:
Previously, `Dag` had 2 low-level `build_segments` APIs:
- Dag::build_flat_segments(..., last_threshold)
- Dag::build_high_level_segments(..., drop_last)
They allow customization about whether the segments are lagging or not.
However, certain algorithms (ex. children and range) now require the high level
segments to cover everything covered by the flat segments. The above APIs
wouldn't ensure that.
This diff refactors the segment building APIs to:
- Make `build_flat_segments` and `build_high_level_segments` private to
  prevent misuse.
- Ensure high level segments cover flat segments at `Dag::open` and
  `Dag::build_segments_volatile`, the only ways to change `Dag`.
- Provide different APIs suited to different use-cases (one-time in-memory vs
  on-disk). The on-disk `build_segments_persistent` API makes high
  level segments lagging to avoid fragmentation, while the in-memory
  `build_segments_volatile` does not.
To satisfy the existing test need, a `set_segment_size` API was added to
override the default segment size.
Most callsites become simpler because they no longer need to figure out
details about segment size, level, and lagging.
Reviewed By: sfilipco
Differential Revision: D17000965
fbshipit-source-id: 78bb0c7674c99e91be6011bb7e623cd4f63b1521
Summary:
The parking_lot crate is more convenient to use than std::sync, on top
of everything else listed at https://crates.io/crates/parking_lot. Let's
use it everywhere.
Reviewed By: quark-zju
Differential Revision: D17337444
fbshipit-source-id: b5489be0b7d2bd5f6a6edc5d1d6eea366a6c05b9
Summary: Add support for calling the new BFS diff implementation from Python. This diff adds the appropriate glue code to the bindings crate and adds a config option (`treemanifest.bfsdiff`) to enable the new functionality.
Reviewed By: xavierd
Differential Revision: D17334739
fbshipit-source-id: 24aac21910e74a42d625c93bed7fa3aa08e167c0
Summary:
Split the crate to improve build time.
Before this change, a trivial change to any of the simple modules could still
take 20+ seconds to compile, even with incremental compilation enabled.
This diff splits the crate into multiple smaller crates. A simple change to a
simple crate can take < 10 seconds to re-compile.
Unlike the pre-D13923866 state, there is still only a single Python
extension.
Reviewed By: xavierd
Differential Revision: D17345706
fbshipit-source-id: c7e2e6f0e1b86071c863cfb8989070a581825956
Summary:
This just moves things around, so that native and pure Python modules are split
into different Python packages. This makes it possible to use the standard zip
importer without hacks (ex. `hgdemandimport/embeddedimport`).
This diff is mostly about moving things. While `make local` still works,
it does break the nupkg build, which will be fixed in a later diff.
Reviewed By: kulshrax
Differential Revision: D15798642
fbshipit-source-id: 5d83f17099aa198df0acd5b7a99667e2f35fe7b4