Summary:
In the future, the changelog will need to access 'remotenames' to figure out
public heads. Move the state file to svfs so it can be read by the changelog
object, which only has access to svfs.
Reviewed By: sfilipco
Differential Revision: D17199834
fbshipit-source-id: 9000e0d8e8bb8d398d6c77b5b395da904fef6418
Summary:
This will be used in a later change.
I found that ChecksumTable::clone crashes when the file is empty. That is fixed as
part of this change.
Reviewed By: sfilipco
Differential Revision: D17379311
fbshipit-source-id: 0e43e8559f0628008e10f1e619a0523474c4094f
Summary:
SyncableDag might have lagging high-level segments that break the "high-level
segments cover all flat segments" assumption made by some DAG algorithms (children,
range).
Remove Deref so those DAG algorithms cannot run on a SyncableDag.
Reviewed By: sfilipco
Differential Revision: D17000967
fbshipit-source-id: 953589ddd8388340a2913b530f442028810296de
Summary: After D17000965, the parameter is no longer used.
Reviewed By: sfilipco
Differential Revision: D17379312
fbshipit-source-id: c6a12ee27a54efcec5caa1769f04095e16dd5995
Summary:
Previously, `Dag` had 2 low-level `build_segments` APIs:
- Dag::build_flat_segments(..., last_threshold)
- Dag::build_high_level_segments(..., drop_last)
They allowed callers to customize whether the segments are lagging or not.
However, certain algorithms (ex. children and range) now require the high level
segments to cover everything covered by the flat segments. The above APIs
wouldn't ensure that.
This diff refactors the segment building APIs so that:
- Make `build_flat_segments`, and `build_high_level_segments` private to
prevent misuse.
- Ensure high level segments cover flat segments at `Dag::open` and
`Dag::build_segments_volatile`, the only ways to change `Dag`.
- Provide different APIs suitable for different (one-time in-memory vs
on-disk) use-cases. The on-disk `build_segments_persistent` API makes high
level segments lagging to avoid fragmentation, while the in-memory
`build_segments_volatile` does not.
To satisfy the existing test need, a `set_segment_size` API was added to
override the default segment size.
Most callsites become simpler because they no longer need to figure out
details about segment size, level, and lagging.
Reviewed By: sfilipco
Differential Revision: D17000965
fbshipit-source-id: 78bb0c7674c99e91be6011bb7e623cd4f63b1521
Summary:
Parents are ordered and the order can be important. The flat segments preserve
the order but there are no APIs exposing the parent order. Define one.
Reviewed By: markbt
Differential Revision: D17000966
fbshipit-source-id: 66beb53d9cef651a53391707c0b690a1e3b76ce2
Summary: See the previous diff for motivation.
Reviewed By: sfilipco
Differential Revision: D17000964
fbshipit-source-id: 3bf112e4b608cf06bf9082d993635df1d65a81fa
Summary:
Segmented changelog wants:
- Entries stored on disk are lagging (i.e. the last segment at each level is
dropped to avoid fragmentation).
- Entries in memory are not lagging.
To ensure that, the easiest way seems to be:
- Upon initializing the dag structure, build missing segments in memory.
- When preparing to write to disk, drop the in-memory segments.
This diff makes it possible to drop in-memory index changes.
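To illustrate why lagging on-disk segments cannot be used directly by algorithms that expect full coverage, here is a minimal sketch under a simplified model. It is not the actual `Dag` code: it assumes each high-level segment simply merges up to `segment_size` consecutive lower-level segments. `Span`, `build_high_level`, and `covers_all` are hypothetical names.

```rust
/// Hypothetical inclusive id range `(low, high)` covered by a segment.
type Span = (u64, u64);

/// Merge every `segment_size` consecutive lower-level segments into one
/// higher-level segment. With `drop_last` set, the last (possibly partial)
/// segment is dropped, which is what "lagging" means in this sketch.
fn build_high_level(lower: &[Span], segment_size: usize, drop_last: bool) -> Vec<Span> {
    assert!(segment_size > 0, "segment_size must be positive");
    let mut higher: Vec<Span> = lower
        .chunks(segment_size)
        .map(|chunk| (chunk[0].0, chunk[chunk.len() - 1].1))
        .collect();
    if drop_last {
        higher.pop();
    }
    higher
}

/// True if the higher level reaches at least as far as the lower level.
fn covers_all(lower: &[Span], higher: &[Span]) -> bool {
    match (lower.last(), higher.last()) {
        (None, _) => true,
        (Some(l), Some(h)) => h.1 >= l.1,
        (Some(_), None) => false,
    }
}
```

In this model, building without `drop_last` (the in-memory case) always covers the flat segments, while building with `drop_last` (the on-disk case) leaves the tail uncovered until the missing segments are rebuilt in memory.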
Reviewed By: sfilipco
Differential Revision: D17000970
fbshipit-source-id: 554746c9a868cc3881dd836dc3a3cda265b54287
Summary:
Higher level segments need to produce fewer segments than the level below. If they cannot do that,
there is no point creating those levels.
This makes it possible to auto-detect the maximum segment level.
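The stopping rule can be sketched as follows. This is a simplified model, not the actual implementation: it assumes a level-N+1 segment merges up to `segment_size` level-N segments, and the function names are hypothetical.

```rust
/// Number of segments at each level, starting with the flat level (level 0).
/// A new level is only added while it produces fewer segments than the
/// level below it.
fn segment_counts_per_level(flat_segments: usize, segment_size: usize) -> Vec<usize> {
    assert!(segment_size > 0, "segment_size must be positive");
    let mut counts = vec![flat_segments];
    loop {
        let lower = counts[counts.len() - 1];
        // Ceiling division: a partial group still needs one segment.
        let higher = (lower + segment_size - 1) / segment_size;
        if higher >= lower {
            // The new level would not reduce the segment count: stop here.
            break;
        }
        counts.push(higher);
    }
    counts
}

/// The highest level worth creating under the model above.
fn max_level(flat_segments: usize, segment_size: usize) -> usize {
    segment_counts_per_level(flat_segments, segment_size).len() - 1
}
```

With 100 flat segments and a segment size of 16, this model yields 100, 7, and 1 segments for levels 0, 1, and 2, so level 2 is the maximum.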
Reviewed By: sfilipco
Differential Revision: D17000968
fbshipit-source-id: bfb35f7451a40612803cb232be163db12bcab2fd
Summary:
The fast path takes advantage of high-level segments and avoids visiting lower
level segments if possible.
`cargo bench --bench dag_ops children` before:
children 16.842 s
after:
children 427.577 ms
Reviewed By: markbt
Differential Revision: D16976385
fbshipit-source-id: ae1b5f26b6ce2213f5ce1f28c4ac1cf49678992f
Summary: This allows us to check if the "HAS_ROOT" flag is correctly set or not.
Reviewed By: sfilipco
Differential Revision: D16976384
fbshipit-source-id: a5bb73b0bc414e422afd7c8690c55ee1076ef9f2
Summary:
This would allow us to implement the fast path in `children`.
This is a breaking change in file format. But we haven't been really using the
segmented dag format in production yet. So it's fine to not have a migration
path.
This potentially makes future format migrations easier since there are 7
bits left to use.
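A minimal sketch of what a one-byte flags field with 7 spare bits could look like. This assumes the single bit in use is the "HAS_ROOT" flag mentioned in a neighboring change; the bit position, the `SegmentFlags` type, and its methods are assumptions for illustration, not the actual on-disk layout.

```rust
/// Hypothetical one-byte flags field stored with each segment.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct SegmentFlags(u8);

impl SegmentFlags {
    /// Assumed bit position for the HAS_ROOT flag.
    const HAS_ROOT: u8 = 0b0000_0001;
    /// Bits not assigned yet, available for future format changes.
    const UNUSED_MASK: u8 = !Self::HAS_ROOT;

    /// Return a copy with the HAS_ROOT bit set.
    fn with_root(self) -> Self {
        SegmentFlags(self.0 | Self::HAS_ROOT)
    }

    /// Check whether the HAS_ROOT bit is set.
    fn has_root(self) -> bool {
        self.0 & Self::HAS_ROOT != 0
    }
}
```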
Reviewed By: markbt
Differential Revision: D16976382
fbshipit-source-id: 0110c03d4ff2ce0a882b3fb8ce8ee12d2087f41f
Summary:
The children operation is now O(flat segments) = O(merges). But it has
potential to be faster with more precalculated information in segments.
See added comments for details.
Reviewed By: markbt
Differential Revision: D16976383
fbshipit-source-id: 80bfd2cefa5aa5ceabfd7e40d46d06f7e2b64d34
Summary:
This makes dag_ops benchmark runnable.
Example benchmark result run with `sudo nice -20`:
building segments 489.076 ms
ancestors 378.471 ms
common_ancestors (spans) 495.074 ms
gca_one (2 ids) 607.611 ms
gca_one (spans) 509.784 ms
gca_all (2 ids) 977.349 ms
gca_all (spans) 570.267 ms
heads 248.168 ms
heads_ancestors 248.524 ms
is_ancestor 295.376 ms
parents 246.623 ms
Reviewed By: sfilipco
Differential Revision: D16992731
fbshipit-source-id: 838f9240ca942f442fa395c1cf7bf914c52652c2
Summary:
This will guide optimizations.
Right now, gca_one "large sets" can take forever to run.
The code to build the mozilla DAG was extracted to a single file to be sharable.
Reviewed By: sfilipco
Differential Revision: D16992730
fbshipit-source-id: 1538f5b0098cd06cb179bd556df285055e1d62b6
Summary:
This is the mutable operation that allows other code paths to add spans to a
SpanSet. It will be used in a later change.
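As an illustration only, a mutable "add a span" operation on a set of id ranges might look like the sketch below. The real `SpanSet` layout and method names are not shown in this change, so `Span`, `SpanSet`, `push`, and the ascending sort order used here are assumptions.

```rust
/// Hypothetical inclusive range of ids.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    low: u64,
    high: u64,
}

/// Hypothetical set of spans, kept sorted ascending, non-overlapping and
/// non-adjacent.
#[derive(Debug, Default, PartialEq, Eq)]
struct SpanSet {
    spans: Vec<Span>,
}

impl SpanSet {
    /// Add a span, merging it with any overlapping or adjacent spans.
    fn push(&mut self, span: Span) {
        assert!(span.low <= span.high, "invalid span");
        let mut merged = span;
        let mut result = Vec::with_capacity(self.spans.len() + 1);
        let mut inserted = false;
        for &s in &self.spans {
            if s.high.saturating_add(1) < merged.low {
                // `s` ends before `merged` starts and is not adjacent.
                result.push(s);
            } else if merged.high.saturating_add(1) < s.low {
                // `s` starts after `merged` ends: emit `merged` first, once.
                if !inserted {
                    result.push(merged);
                    inserted = true;
                }
                result.push(s);
            } else {
                // Overlapping or adjacent: absorb `s` into `merged`.
                merged.low = merged.low.min(s.low);
                merged.high = merged.high.max(s.high);
            }
        }
        if !inserted {
            result.push(merged);
        }
        self.spans = result;
    }
}
```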
Reviewed By: markbt
Differential Revision: D17169483
fbshipit-source-id: be9286e67bfa64e961ee75482432aa61dfc0e6ed
Summary:
We don't use fncache. It does not scale. The test breaks with narrow-heads.
This diff just removes the fncache test.
Reviewed By: singhsrb
Differential Revision: D17293478
fbshipit-source-id: b41dba333276ad63973b7556dfc400f4b33f6f5d
Summary:
Ondemand has been complaining that their hgcache was growing very quickly, and
looking at logs and scuba data, this appears to be caused by refetching
everything from the network and writing it to the indexedlogdatastore.
Looking at the prefetch code, I realized that the indexedlogdatastore wasn't
present in the repo.fileslogs.shareddatastore list, and therefore may be
omitted from the stores in which we look for data.
Reviewed By: kulshrax
Differential Revision: D17374761
fbshipit-source-id: 9e9f279d4c8154e2491312a57d24bc5fd7da79fc
Summary: This is no longer used, let's remove it.
Reviewed By: quark-zju
Differential Revision: D17337442
fbshipit-source-id: 2272c2662440bc2be2a3ff29ef11fc4e0eb3605f
Summary:
The parking_lot crate is more convenient to use than std::sync, on top
of everything else listed at https://crates.io/crates/parking_lot. Let's
use it everywhere.
Reviewed By: quark-zju
Differential Revision: D17337444
fbshipit-source-id: b5489be0b7d2bd5f6a6edc5d1d6eea366a6c05b9
Summary:
I hit this when editing the dag stack. This resolves a "BUG" in the
test-amend-nextrebase.t test.
Reviewed By: singhsrb
Differential Revision: D17357387
fbshipit-source-id: 309efe34dee71180abdbdb5a9fe2e7b731230051
Summary: Now that several of the submodules in the Manifest crate have their own unit tests, let's de-duplicate the test helper functions across those tests by moving them to a `testutil` submodule (which only gets built in `cfg(test)`).
Reviewed By: xavierd
Differential Revision: D17352890
fbshipit-source-id: 1b7d7cb754ee501def8d1d508f2af5733d548f4d
Summary: Previously, all of the tests for the manifest code were at the top level of this module. The BFS diff module added its own unit tests; for symmetry with this, let's move the tests for the DFS diff into that submodule.
Reviewed By: xavierd
Differential Revision: D17352889
fbshipit-source-id: 9c1685344656d0b2b5af6495bba67929f47d578f
Summary: These structs are generally useful when writing code that traverses manifests. As such, let's move them out of the BFS module into the top level of the tree module so that other code can use them.
Reviewed By: xavierd
Differential Revision: D17352891
fbshipit-source-id: b390ec84a29604dc6eef31a95dba976a5224f5e9
Summary: For symmetry with the BFS diff implementation, move the DFS diff implementation into its own module. This will help unclutter mod.rs.
Reviewed By: xavierd
Differential Revision: D17352892
fbshipit-source-id: 61709cd3e430c8676c529fbbbb76a9775c05053d
Summary: Add support for calling the new BFS diff implementation from Python. This diff adds the appropriate glue code to the bindings crate and adds a config option (`treemanifest.bfsdiff`) to enable the new functionality.
Reviewed By: xavierd
Differential Revision: D17334739
fbshipit-source-id: 24aac21910e74a42d625c93bed7fa3aa08e167c0
Summary:
On WWW, an `hg update` ends up writing ~15GB worth of data onto the
IndexedLogDataStore, which eats up their precious RAM. As a quick workaround,
let's reduce the max number of logs from 10 down to 4, and increase the size of
each log to keep the total expected size around ~10GB.
Ideally, both these values should be configurable within the hg config,
but since the IndexedLog is written within hg_memcache_client, we would have to
plumb the values onto it. Medium term, hg_memcache_client will be folded into
hg itself, and this change will be much easier by then.
Do the same for the IndexedLogHistoryStore.
Reviewed By: quark-zju
Differential Revision: D17354856
fbshipit-source-id: 0a75953f40e1982eaf43557f7866f089873300db
Summary:
`ascii` was used as the default / fallback, which is not a user-friendly choice.
Nowadays utf-8 dominates:
- Rust stdlib is utf-8.
- Ruby since 1.9 is utf-8 by default.
- Python 3 is unicode by default.
- Windows 10 adds utf-8 code page.
Given the fact that:
- Our CI sets HGENCODING to utf-8
- Nuclide passes `--encoding=utf-8` to every command.
- Some people have messed up their `LC_*` settings and complained about hg crashes.
- utf-8 is a superset of ascii; nobody has complained that they wanted the `ascii`
encoding and that the `utf-8` encoding messed up their setup.
Let's just use `utf-8` as the default encoding. More aggressively, if someone
sets `ascii` as the encoding, it's almost always a mistake. Auto-correct that
to `utf-8` too.
This should also make future integration with Rust easier (where utf-8 is
enforced and there is no option to change the encoding). In the future we
might just drop the flexibility of choosing customized encoding, so this diff
autofixes `ascii` to `utf-8`, instead of allowing `ascii` to be set. We cannot
enforce `utf-8` yet, because of Windows.
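The fallback and auto-correct rule described above can be sketched as a small function. This is an illustration rather than the actual change, which lives in hg's Python encoding code; `effective_encoding` is a hypothetical name, and the case-insensitive match on `ascii` is an assumption.

```rust
/// Decide which encoding to use, given the configured value (if any).
/// - Nothing configured: default to utf-8 (previously ascii).
/// - `ascii` configured: treated as a mistake and auto-corrected to utf-8.
/// - Anything else: kept as-is, since utf-8 cannot be enforced yet.
fn effective_encoding(configured: Option<&str>) -> &str {
    match configured {
        None => "utf-8",
        Some(name) if name.eq_ignore_ascii_case("ascii") => "utf-8",
        Some(name) => name,
    }
}
```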
Here is our encoding strategy vs the upstream's:
| item | upstream (current) | upstream (ideal) | ours (current) | ours (ideal) |
| CLI argv | bytes | bytes | utf-8 [1] | utf-8 |
| path | bytes | auto [3] | migrating [2] | utf-8 |
| commit message | utf-8 | utf-8 | utf-8 | utf-8 |
| bookmark name | utf-8 | utf-8 | utf-8 | utf-8 |
| file content | bytes | bytes | bytes | bytes |
[1]: A Rust wrapper accidentally enforced utf-8 for command-line arguments. But
it simplified a lot of things and is kind of ok: everything that can be passed
as a CLI argument is utf-8: -M commit message, -b bookmark, paths, etc. There is
no "file content" passed via CLI arguments.
[2]: Path is controversial, because it's possible for systems to have non-utf8
paths. The upstream behavior is incorrect if a repo gets shared among different
encoding systems (ex. both Linux and Windows). We have to know the encoding of
paths to be able to convert them into a form suitable for the local system. One way is to
enforce UTF-8 for paths. The other is to keep encoding information stored with
individual paths (like Ruby strings). The UTF-8 approach is much simpler with
the tradeoff that non-utf-8 paths become unsupported, which seems to be a
reasonable trade-off.
[3]: See https://www.mercurial-scm.org/wiki/WindowsUTF8Plan.
Reviewed By: singhsrb
Differential Revision: D17098991
fbshipit-source-id: c0ff1e586a887233bd43cdb854fb3538aa9b70c2
Summary:
It can fail with:
test-fb-hgext-treemanifest-treeonly-copyamend.t ...
--- test-fb-hgext-treemanifest-treeonly-copyamend.t
+++ test-fb-hgext-treemanifest-treeonly-copyamend.t.err
@@ -124,6 +124,7 @@
adding a/b/c/d/e/f/g/h/i/j/file3
fetching tree '' efa8fa4352b919302f90e85924e691a632d6bea0, found via 9f95b8f1011f
11 trees fetched over *s (glob)
+ 5 files fetched over 1 fetches - (5 misses, 0.00% hit ratio) over 0.00s
or:
--- test-fb-hgext-treemanifest-treeonly-copyamend.t
+++ test-fb-hgext-treemanifest-treeonly-copyamend.t.err
@@ -124,6 +124,7 @@
adding a/b/c/d/e/f/g/h/i/j/file3
fetching tree '' efa8fa4352b919302f90e85924e691a632d6bea0, found via 9f95b8f1011f
11 trees fetched over *s (glob)
+ 12 files fetched over 1 fetches - (12 misses, 0.00% hit ratio) over 0.00s
It fails more easily on Ubuntu. But it's also possible on CentOS.
Stabilize the test by allowing the optional output.
Reviewed By: singhsrb
Differential Revision: D17346110
fbshipit-source-id: ca6d1de5163e1b2bcb7bea5c619220d6f5e2c864
Summary:
Split the crate to improve build time.
Before this change, a naive change on any of the simple modules can still take
20+ seconds to compile, even with incremental compilation enabled.
This diff splits the crate into multiple smaller crates. A simple change to a
simple crate can take < 10 seconds to re-compile.
Unlike the pre-D13923866 state, there is still only a single Python
extension.
Reviewed By: xavierd
Differential Revision: D17345706
fbshipit-source-id: c7e2e6f0e1b86071c863cfb8989070a581825956
Summary: Diffusion does not have local commit information for imported diffs (e.g. imported from GitHub), and it returns a list for such commits. This breaks `hg ssl`. We can simply skip the commit if Diffusion gives us a list.
Reviewed By: quark-zju
Differential Revision: D17334156
fbshipit-source-id: 4c4278de94e24c646a3e789377c12f42adb4307e
Summary: Add a prefetch method to the `remotetreestore` in the `treemanifest` extension, along with the necessary plumbing to call it from Rust code.
Reviewed By: quark-zju
Differential Revision: D17335773
fbshipit-source-id: 2b71638f56ea7e1398348f437d737a599d8be476
Summary: This diff provides an implementation of the diff operation for trees which processes directories in BFS order (i.e., layer by layer). This allows the iterator to perform a bulk prefetch of the changed nodes in each layer at the start of each layer of the traversal. This should hopefully provide a more efficient fetch pattern than the existing implementation, which requires a full prefetch of both trees upfront for reasonable performance.
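A rough sketch of the layer-by-layer idea, using a hypothetical in-memory tree rather than the real manifest types. `Node`, `dir`, `bfs_diff`, and the `prefetch` callback are illustrative names; each node is assumed to carry an id that changes whenever its content changes, so unchanged directories can be skipped without descending into them.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical tree node; the `u64` is a content id.
enum Node {
    File(u64),
    Dir(u64, BTreeMap<String, Node>),
}

impl Node {
    fn id(&self) -> u64 {
        match self {
            Node::File(id) | Node::Dir(id, _) => *id,
        }
    }

    fn entries(&self) -> Option<&BTreeMap<String, Node>> {
        match self {
            Node::Dir(_, entries) => Some(entries),
            Node::File(_) => None,
        }
    }
}

/// Convenience constructor for directories.
fn dir(id: u64, entries: Vec<(&str, Node)>) -> Node {
    Node::Dir(id, entries.into_iter().map(|(n, node)| (n.to_string(), node)).collect())
}

/// Return paths of changed files, visiting changed directories one layer at
/// a time. `prefetch` is called once per layer with that layer's directories.
fn bfs_diff<'a>(
    left: &'a Node,
    right: &'a Node,
    mut prefetch: impl FnMut(&[String]),
) -> Vec<String> {
    let mut changed_files = Vec::new();
    let mut layer = vec![(String::new(), Some(left), Some(right))];
    while !layer.is_empty() {
        // Bulk prefetch for the whole layer before examining it.
        let dirs: Vec<String> = layer.iter().map(|(path, _, _)| path.clone()).collect();
        prefetch(&dirs);
        let mut next_layer = Vec::new();
        for (path, l, r) in layer {
            let l_entries = l.and_then(|n| n.entries());
            let r_entries = r.and_then(|n| n.entries());
            let names: BTreeSet<&String> = l_entries
                .into_iter()
                .chain(r_entries)
                .flat_map(|entries| entries.keys())
                .collect();
            for name in names {
                let lc = l_entries.and_then(|entries| entries.get(name));
                let rc = r_entries.and_then(|entries| entries.get(name));
                let child = if path.is_empty() { name.clone() } else { format!("{}/{}", path, name) };
                // File side: report added, removed, or modified files.
                let l_file = lc.filter(|n| n.entries().is_none()).map(|n| n.id());
                let r_file = rc.filter(|n| n.entries().is_none()).map(|n| n.id());
                if l_file != r_file {
                    changed_files.push(child.clone());
                }
                // Directory side: queue changed directories for the next layer.
                let l_dir = lc.filter(|n| n.entries().is_some());
                let r_dir = rc.filter(|n| n.entries().is_some());
                let same_dir = match (l_dir, r_dir) {
                    (Some(a), Some(b)) => a.id() == b.id(),
                    _ => false,
                };
                if (l_dir.is_some() || r_dir.is_some()) && !same_dir {
                    next_layer.push((child, l_dir, r_dir));
                }
            }
        }
        layer = next_layer;
    }
    changed_files
}
```

The point of the sketch is the shape of the traversal: only directories that differ are queued, and each layer issues a single prefetch call for all of them.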
Reviewed By: xavierd
Differential Revision: D17276971
fbshipit-source-id: 284f1d458f43cb76befe27e85f53a641f29d7550
Summary:
Add a `prefetch` method to the `TreeStore` trait. This will be used by code using the store to signal that certain keys will be accessed soon. The default implementation is a no-op, but in the case of stores where prefetching makes sense (such as stores backed by remote servers), the default implementation can be overridden to include the appropriate prefetching logic.
For now, this change is a no-op, but later in this stack it will be used to signal to the underlying Python data store to perform the appropriate tree fetches via the Eden API. This will be used to support a more efficient pattern of bulk tree fetches during the diff operation.
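A minimal sketch of a trait method with a default no-op implementation that a remote-backed store overrides. The signatures are simplified assumptions and not the real `TreeStore` API: `Key` is a plain string here, errors are `String`, and `LocalStore` and `RemoteStore` are hypothetical stores.

```rust
use std::cell::RefCell;

/// Simplified stand-in for the real key type.
type Key = String;

trait TreeStore {
    fn get(&self, key: &Key) -> Option<Vec<u8>>;

    /// Hint that `keys` will be accessed soon. The default does nothing.
    fn prefetch(&self, _keys: &[Key]) -> Result<(), String> {
        Ok(())
    }
}

/// A store where prefetching makes no sense: it keeps the default no-op.
struct LocalStore;

impl TreeStore for LocalStore {
    fn get(&self, _key: &Key) -> Option<Vec<u8>> {
        None
    }
}

/// A store backed by a (pretend) remote server: it overrides `prefetch`.
/// This sketch only records what would be fetched in bulk.
struct RemoteStore {
    requested: RefCell<Vec<Key>>,
}

impl TreeStore for RemoteStore {
    fn get(&self, _key: &Key) -> Option<Vec<u8>> {
        None
    }

    fn prefetch(&self, keys: &[Key]) -> Result<(), String> {
        self.requested.borrow_mut().extend_from_slice(keys);
        Ok(())
    }
}
```

Callers can invoke `prefetch` unconditionally; stores that do not benefit from it pay nothing.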
Reviewed By: sfilipco
Differential Revision: D17276970
fbshipit-source-id: 22a5d847e5be5dbf1b0a74b47587a98d840b8cdc
Summary: The `scm-prompt` files are a bit special: modifying them didn't trigger those tests.
Reviewed By: wez
Differential Revision: D17346163
fbshipit-source-id: ffafc017373031905cbf1fc2f80a3a8e8a606094
Summary:
This adds the remote server to logging in the lfs extensions, which will let us
know which LFS server we're talking to. This is only collected on batch
requests.
Reviewed By: ikostia
Differential Revision: D17341928
fbshipit-source-id: a458ba3b0a4dce1b3f4ab3ea0d509f9715044f0e
Summary:
I would like to change the length of the displayed hash in scm-prompt
to 8. Why such an impactful diff? Because `hg sl` shows 8 characters, and I
always get confused when the hash in my prompt doesn't match `hg sl`.
Reviewed By: wez
Differential Revision: D17312417
fbshipit-source-id: 3d7e4947c8202e93697c232dbd5abd04e7baee96
Summary:
This updates the LFS extension to send a client correlator when connecting to a
LFS server. This might be helpful for troubleshooting.
Reviewed By: quark-zju
Differential Revision: D17319281
fbshipit-source-id: 3549c0710ad010f9566a961abeedfbb5366bf49c
Summary:
`extern crate` is usually no longer needed in the 2018 edition of Rust. This diff removes `extern crate` lines from fbcode where possible, replacing #[macro_use] with individual macro imports.
Before:
```
#[macro_use]
extern crate futures_ext;
extern crate serde_json;
```
After:
```
use futures_ext::try_boxfuture;
```
Reviewed By: Imxset21
Differential Revision: D17313537
fbshipit-source-id: 70462a2c161375017b77fa44aba166884ad2fdc3
Summary:
This wrapper was needed to allow internal mutability of stores. Now that
internal mutability is handled inside the store, this wrapper is redundant and
the code can be simplified significantly.
Reviewed By: quark-zju
Differential Revision: D17278152
fbshipit-source-id: c488208d4875e26e9551deb86a7c22abbda085ef
Summary:
This allows any MutableHistoryStore to be shared and written from multiple
threads.
Reviewed By: quark-zju
Differential Revision: D17278149
fbshipit-source-id: 69f81bb0b182cb27022f13b2e6330b7fc805cbaa
Summary:
This allows any MutableDeltaStore to be shared and written from multiple
threads.
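A minimal sketch of what sharing a `&self`-based store across threads can look like. The trait here is a simplified stand-in: the real `MutableDeltaStore` methods and types differ, `InMemoryStore` and `write_from_threads` are hypothetical, and `std::sync::Mutex` is used only to keep the example dependency-free.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Simplified stand-in for the store trait: writes take `&self`, and the
/// `Send + Sync` bound lets a trait object be shared between threads.
trait MutableDeltaStore: Send + Sync {
    fn add(&self, key: &str, data: &[u8]);
    fn len(&self) -> usize;
}

/// Hypothetical store that handles mutability internally with a lock.
struct InMemoryStore {
    entries: Mutex<HashMap<String, Vec<u8>>>,
}

impl InMemoryStore {
    fn new() -> Self {
        InMemoryStore { entries: Mutex::new(HashMap::new()) }
    }
}

impl MutableDeltaStore for InMemoryStore {
    fn add(&self, key: &str, data: &[u8]) {
        self.entries.lock().unwrap().insert(key.to_string(), data.to_vec());
    }

    fn len(&self) -> usize {
        self.entries.lock().unwrap().len()
    }
}

/// Write one entry per thread into the same shared store.
fn write_from_threads(store: Arc<dyn MutableDeltaStore>, threads: usize) {
    let handles: Vec<_> = (0..threads)
        .map(|i| {
            let store = Arc::clone(&store);
            thread::spawn(move || store.add(&format!("key-{}", i), b"data"))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

Because `add` takes `&self`, callers need no outer wrapper or `&mut` access; an `Arc` clone per thread is enough.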
Reviewed By: quark-zju
Differential Revision: D17278153
fbshipit-source-id: 17e1474ca1c6d5285cac7dbf519bfd2d5da6e08d
Summary: This will enable switching MutableHistoryStore to use `&self` instead of `&mut self`.
Reviewed By: quark-zju
Differential Revision: D17278151
fbshipit-source-id: 5a6edde5efb0ada14b994d11f33f0aa48780446e
Summary: This will enable switching MutableHistoryStore to use `&self` instead of `&mut self`.
Reviewed By: quark-zju
Differential Revision: D17278154
fbshipit-source-id: cc66d2874bd86235cd39ce3f5357d155e20ef447
Summary: This will enable switching MutableDeltaStore to use `&self` instead of `&mut self`.
Reviewed By: quark-zju
Differential Revision: D17278155
fbshipit-source-id: e7c5d464fb6ba2b31b07127832104b8bd4062fa0
Summary: This will enable switching MutableDeltaStore to use `&self` instead of `&mut self`.
Reviewed By: quark-zju
Differential Revision: D17278148
fbshipit-source-id: c90f62461f784f4a8efb4e5b0ba0c3e21a6f9f77