Commit Graph

32 Commits

Author SHA1 Message Date
Jun Wu
e7ec72f8f1 mutationstore: asyncize more places
Summary:
Make mutationstore more friendly to async.
This resolves an issue with smartlog with the lazy commit hash backend.

Reviewed By: sfilipco

Differential Revision: D27583844

fbshipit-source-id: 5b0b0b9b8ab82399f80eb2b410a0c4b84bd6a444
2021-04-13 16:37:46 -07:00
Lukas Piatkowski
ad106958f2 eden/scm/lib: autogenerate all Cargo.toml files with autocargo
Summary: This diff removes the split between manually managed and autocargo managed Cargo.toml files in `eden/scm/lib`, now all files are autogenerated.

Reviewed By: quark-zju

Differential Revision: D26830884

fbshipit-source-id: 3a5d8409a61347c7650cc7d8192fa426c03733dc
2021-03-05 04:29:49 -08:00
Jun Wu
3ced6804e4 dag: ensure MetaSet has Hints set
Summary:
Previously, `MetaSet` was constructed with default `Hints`. That disables fast
paths. Revise the API so MetaSet requires an explicit `Hints` to address the
issue.

Reviewed By: sfilipco

Differential Revision: D26203557

fbshipit-source-id: 9e7658af8723b06d0efdcad1ab4671c79e907326
2021-02-05 11:53:46 -08:00
Jun Wu
dcf4957619 dag: make parent function async
Summary: Make the parent function used by various graph building functions async.

Reviewed By: sfilipco

Differential Revision: D25353612

fbshipit-source-id: 31f173dc82f0cce6022cc2caae78369fdc821c8f
2020-12-10 12:37:36 -08:00
Jun Wu
461fa77fd7 dag: make Set::flatten async
Summary: This will make it easier to make IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25345239

fbshipit-source-id: 684a0843ae32270aa9b537ef9a2b17a28c027e51
2020-12-10 12:37:34 -08:00
Jun Wu
f854d2e03e dag: make DagAlgorithm async
Summary:
Update DagAlgorithm and all its users to async. This makes it easier to make
IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25345236

fbshipit-source-id: d6cf76723356bd0eb81822843b2e581de1e3290a
2020-12-10 12:37:34 -08:00
Jun Wu
6c02a90386 dag: make MetaSet accept async evaluate and contains
Summary:
Make it possible to use async functions in MetaSet functions.
It will be used when DagAlgorithm becomes async.

Reviewed By: sfilipco

Differential Revision: D25345229

fbshipit-source-id: 0469d572b56df21fbdbdfae4178377e572adbcda
2020-12-10 12:37:34 -08:00
Jun Wu
0ac5bcef79 mutationstore: make calculate_obsolete async
Summary: This will make it easier if the `Set` passed in requires async.

Reviewed By: sfilipco

Differential Revision: D25345230

fbshipit-source-id: a327d4e5d425b7eb5296b2fbe25c446492aa9ea7
2020-12-10 12:37:34 -08:00
Jun Wu
a03e8f4c55 dag: make DagPersistent and DagAddHeads async
Summary: This makes it easier to make DagAlgorithm async.

Reviewed By: sfilipco

Differential Revision: D25345234

fbshipit-source-id: 5ca4bac38f5aac4c6611146a87f423a244f1f5a2
2020-12-10 12:37:33 -08:00
Jun Wu
32176eca42 dag: add async interface for NameSet
Summary:
Change the main API of NameSet to async. Use the `nonblocking` crate to bridge
the sync and async world for compatibility. Future changes will migrate
Iterator to async Stream.

Reviewed By: sfilipco

Differential Revision: D24806696

fbshipit-source-id: f72571407a5747a4eabe096dada288656c9d426e
2020-12-10 12:37:31 -08:00
Jun Wu
45db3bbf96 mutationstore: add a native path to calculate 'obsolete()'
Summary:
The new path does not calculate the complicated `successorssets`, and is
known to make wez's repo operations significantly faster (which, I suspect is
slowed by a very long chain).

The new code is about 3x faster on my repo too:

  # before
  In [1]: list(repo.nodes('draft()'))
  In [2]: %time len(m.mutation.obsoletenodes(repo))
  CPU times: user 246 ms, sys: 42.2 ms, total: 288 ms
  Wall time: 316 ms
  Out[2]: 1127

  # after
  In [1]: list(repo.nodes('draft()'))
  In [2]: %time len(m.mutation.obsoletenodes(repo))
  CPU times: user 74.3 ms, sys: 7.92 ms, total: 82.3 ms
  Wall time: 82.3 ms
  Out[2]: 1127

Reviewed By: markbt

Differential Revision: D23036063

fbshipit-source-id: afd6ac122bb5d8d513b5cdc033e04d2c377286eb
2020-08-21 13:00:45 -07:00
Jun Wu
78477ad9c5 mutationstore: optimize get_dag
Summary:
Optimize get_dag:
- Avoid parsing mutation entries once they are parsed, by keeping an in-memory
  `parent_map`.
- Pass `heads` to `add_heads` so the segments are less fragmented, cycle break
  helper is more efficient.

The `heads` optimization is effective. Practically this makes `get_dag` about 2x faster.

This has a subtle change on cycle handling - full cycle without any non-cycle heads will
be ignored. Practically cycles are rare so it might be okay.

Together with improvements on the `dag` side, `get_dag` is about 4x faster.

Reviewed By: markbt

Differential Revision: D23036062

fbshipit-source-id: 3dc407b562f7ebf2543a87c5cd651ad6a2339d67
2020-08-21 13:00:45 -07:00
Jun Wu
6fd7a2e582 dag: use concrete error types
Summary:
This is more complex than previous libraries, mainly because `dag` defines APIs
(traits) used by other code, which might raise error type not interested
by `dag` itself. `BackendError::Other(anyhow::Error)` is currently used to
capture types that do not fit in `dag`'s predefined error types.

Reviewed By: sfilipco

Differential Revision: D22883865

fbshipit-source-id: 3699e14775f335620eec28faa9a05c3cc750e1d1
2020-08-06 12:31:57 -07:00
Jun Wu
8d0f48c4da dag: rename some anyhow::Result to dag::Result
Summary:
Prefix some `Result` with `dag::Result`. Since `dag::Result` is just
`anyhow::Result` for now, this does not change anything but makes
it more compatible with upcoming changes.

Reviewed By: sfilipco

Differential Revision: D22883864

fbshipit-source-id: 95a26897ed026f1bb8000b7caddeb461dcaad0e7
2020-08-06 12:31:57 -07:00
Jun Wu
868c2b0108 mutationstore: copy entries automatically on flush
Summary:
Similar to D7121487 (af8ecd5f80) but works for mutation store. This makes sure at the Rust
layer, mutation entries won't get lost after rebasing or metaeditting a set of
commits where a subset of the commits being edited has mutation relations.

Unlike the Python layer, the Rust layer works for mutation chains. Therefore
some of the tests changes.

Reviewed By: markbt

Differential Revision: D22174991

fbshipit-source-id: d62f7c1071fc71f939ec8771ac5968b992aa253c
2020-07-02 13:22:34 -07:00
Jun Wu
2f1d35b06e mutationstore: break cycles for get_dag
Summary: Avoids infinite loops creating the DAG.

Reviewed By: markbt

Differential Revision: D22174978

fbshipit-source-id: ec54665a9d5b88a97fce988456041f9aabc824d6
2020-07-02 13:22:34 -07:00
Jun Wu
d647045a5f mutation: lift "1:1" restriction for get_dag
Summary:
Enforcing 1:1 handling is causing trouble for multiple reasons:
- Test cases like "Metaedit with descendant folded commits" in
  test-mutation.t requires handling non-1:1 entries.
- Currently operations like `metaedit-copy` uses two predecessors (edited,
  copied). It is considered 1:1 rewrite (rewriting 1 commit to
  another 1 commit). However, it does not use multiple records, therefore
  cannot be distinguished from `fold`, which is not 1:1 rewrite and also
  has multiple predecessors. We want to include the `metaedit-copy`
  entries in the 1:1 DAG too.

Therefore lift the 1:1 restriction and let's see how it works.

Reviewed By: markbt

Differential Revision: D22175008

fbshipit-source-id: 84d870dbcc433a0d0e5611a83c93781bfa59d035
2020-07-02 13:22:34 -07:00
Jun Wu
64dc05ab9d dag: move add_heads, flush, add_heads_and_flush to traits
Summary: This allows other kinds of DAG to implement the operations.

Reviewed By: sfilipco

Differential Revision: D21626220

fbshipit-source-id: 896c5ccebb1672324d346dfca6bcac9b4d3b4929
2020-05-27 12:16:47 -07:00
Jun Wu
e0d11803f2 dag: move DagAlgorithm to an independent trait
Summary:
This decouples DagAlgorithm from the IdMap + IdDag backend, making it possible
to support other kinds of backends of DagAlgorithm (ex. a revlog backend).

Reviewed By: sfilipco

Differential Revision: D21626200

fbshipit-source-id: f53cc271a200062e9c02f739b6453e1d7de84e6d
2020-05-27 12:16:45 -07:00
Jun Wu
4f39a8e5a6 mutationstore: add a method that returns a dag
Summary:
Part of the mutation graph (excluding split and fold) can fit in the DAG
abstraction. Add a method to do that. This allows cross-dag calculations
like:

  changelogdag = ... # suppose available by segmented changelog

  # mutdag and changelogdag are independent (might have different nodes),
  # with full DAG operations on either of them.
  mutdag = mutation.getdag(...)
  mutdag.heads(mutdag.descendants([node])) & changelogdag.descendants([node2]) # now possible

Comparing to the current situation, this has some advantages:
- No need to couple the "visibility", "filtered node" logic to the mutation
  layer. The unknown nodes can be filtered out naturally by a set "&"
  operation.
- DAG operations like heads, roots can be performed on mutdag when it's
  previously impossible. We also get operations like visualization for free.

There are some limitations, though:
- The DAG cannot represent non 1:1 modifications (fold, split) losslessly.
  Those relationships are simply ignored for now.
- The MemNameDag is not lazy. Reading a long chain of amends might be slow.
  For most normal use-cases it is probably okay. If it becomes an issue we
  can seek for other solutions, for example, store part of mutationstore
  directly in a DAG format on disk, or have fast paths to bypass long
  predecessor chain calculation.

Reviewed By: DurhamG

Differential Revision: D21486521

fbshipit-source-id: 03624c8e9803eb1852b3034b8f245555ec582e85
2020-05-13 09:45:24 -07:00
Mark Thomas
c05efd8a5c mutationstore: move MutationEntry type to types crate
Summary: Move the MutationEntry type to the Mercurial types crate.  This will allow us to use it from Mononoke.

Reviewed By: quark-zju

Differential Revision: D20871338

fbshipit-source-id: 8de3bb8a2673673bc4c8a6cc7578a0a76358c14a
2020-04-23 08:58:10 -07:00
Mark Thomas
5666399fcf mutationstore: switch mutation entry timestamp from f64 to i64
Summary:
The mutation store stores entries with a floating-point timestamp.  This
pattern was copied from obsmarkers.

However, Mercurial uses integer timestamps in the commit metadata (the
parser supports floats for historical reasons, but only stores integer
timestamps).   Mononoke also uses integer timestamps in its `DateTime`
type.

To keep things simple, switch to using integer timestamps for mutation
entries.  Existing entries with floating point timestamps are truncated.

Add a new entry format version that encodes the timestamp as an integer.
For now, continue to generate the old version so that old clients can
read entries created by new clients.

Reviewed By: quark-zju

Differential Revision: D20444366

fbshipit-source-id: 4d6d9851aacb314abea19b87c9d0130c47fdf512
2020-03-17 04:18:44 -07:00
Mark Thomas
ac80212e8f mutationstore: remove mutation entry origins
Summary:
Tracking the origin of mutation entries did not prove useful, and just creates
an un-necessary overhead.  Remove the tracking and repurpose the field as a
version field.

Reviewed By: quark-zju

Differential Revision: D20444365

fbshipit-source-id: 65ff11ee8cfe77d5e67a83d03a510541d58ef69b
2020-03-17 04:18:44 -07:00
Jun Wu
75e4ffc17f indexedlog: change IndexDef.lag_threshold back from entries to bytes
Summary:
I thought the index function could be the bottleneck. However, the Log reading
(xxhash, decoding vlqs) can be much slower for very long entries. Therefore
using bytes as the lag threshold is better. It does leaked the Log
implementation details (how it encodes an entry) to some extend, though.

Reverts D20042045 and D20043116 logically. The lagging calculation is using
the new Index::get_original_meta API, which is easier to verify correctness
(In fact, it seems the old code is wrong - it might skip Index flushes if
sync() is called multiple times without flushing).

This should mitigate an issue where a huge entry (generated by `hg trace`) in
blackbox does not get indexed in time and cause performance regressions.

Reviewed By: DurhamG

Differential Revision: D20286508

fbshipit-source-id: 7cd694b58b95537490047fb1834c16b30d102f18
2020-03-05 13:29:48 -08:00
Jun Wu
c417232b1b mutationstore: update lag_threshold
Summary:
D20042045 changes the meaning of "lag_threshold". Update the value in mutation
store accordingly.

Reviewed By: DurhamG

Differential Revision: D20043116

fbshipit-source-id: 154e6dc2aa88ab0a9a9b21929ae5fa6163dcd403
2020-02-28 09:23:59 -08:00
Mark Thomas
18ecb01b8a mutationstore: update tests so that user is now a string
Summary: D19649887 changed mutation entry users to be strings.  Update the tests accordingly.

Reviewed By: simpkins

Differential Revision: D19656792

fbshipit-source-id: fcff677099dc0200130bf30eadaaf66822c6139c
2020-01-30 19:54:45 -08:00
Durham Goode
b567c16b60 py3: make mutation markers 'user' utf8
Summary: Username as utf8, so let's make mutationmarker treat them as such.

Reviewed By: xavierd

Differential Revision: D19649887

fbshipit-source-id: 3f8b2db434a57ee8ee3017de8d925c19a2002b20
2020-01-30 15:22:24 -08:00
Jun Wu
b0f061f66b mutationstore: implement indexedlog::DefaultOpenOptions
Summary: This implements `MutationStore::repair` for free.

Reviewed By: xavierd

Differential Revision: D18737912

fbshipit-source-id: 098e22bf4ab94af96be9b0d54914abf880c91f74
2019-12-06 19:35:04 -08:00
David Tolnay
d1d8fb939a Switch from failure::Fail trait to std::error::Error for errors
Summary:
This diff replaces eden's dependencies on failure::Error with anyhow::Error.

Failure's error type requires all errors to have an implementation of failure's own failure::Fail trait in order for cause chains and backtraces to work. The necessary methods for this functionality have made their way into the standard library error trait, so modern error libraries build directly on std::error::Error rather than something like failure::Fail. Once we are no longer tied to failure 0.1's Fail trait, different parts of the codebase will be free to use any std::error::Error-based libraries they like while still working nicely together.

Reviewed By: xavierd

Differential Revision: D18576093

fbshipit-source-id: e2d862b659450f2969520d9b74877913fabb2e5d
2019-11-22 08:53:31 -08:00
David Tolnay
9c6f253858 rust: Replace derive(Fail) with derive(Error)
Summary:
This diff replaces code of the form:

```
use failure::Fail;

#[derive(Fail, Debug)]
pub enum ErrorKind {
    #[fail(display = "something failed {} times", _0)]
    Failed(usize),
}
```

with:

```
use thiserror::Error;

#[derive(Error, Debug)]
pub enum ErrorKind {
    #[error("something failed {0} times")]
    Failed(usize),
}
```

The former emits an implementation of failure 0.1's `Fail` trait while the latter emits an impl of `std::error::Error`. Failure provides a blanket impl of `Fail` for any type that implements `Error`, so these `Error` impls are strictly more general. Each of these error types will continue to have exactly the same `Fail` impl that it did before this change, but now also has the appropriate `std::error::Error` impl which sets us up for dropping our various dependencies on `Fail` throughout the codebase.

Reviewed By: Imxset21

Differential Revision: D18523700

fbshipit-source-id: 0e43b10d5dfa79820663212391ecbf4aeaac2d41
2019-11-14 22:04:38 -08:00
David Tolnay
b1793a4416 rust: Rename Fallible<T> to Result<T>
Summary:
This diff is preparation for migrating off of failure::Fail / failure::Error for errors in favor of errors that implement std::error::Error. The Fallible terminology is unique to failure and in non-failure code we should be using Result<T>. To minimize the size of the eventual diff that removes failure, this codemod replaces all use of Fallible with Result by:

- In modules that do not use Result<T, E>, we import `failure::Fallible as Result`;
- In modules that use a mix of Result<T, E> and Fallible<T> (only 5) we define `type Result<T, E = failure::Error> = std::result::Result<T, E>` to allow both Result<T> and Result<T, E> to work simultaneously.

Reviewed By: Imxset21

Differential Revision: D18499758

fbshipit-source-id: 9f5a54c47f81fdeedbc6003cef42a1194eee55bf
2019-11-14 14:11:01 -08:00
Adam Simpkins
ab3a7cb21f Move fb-mercurial sources into an eden/scm subdirectory.
Summary:
In preparation for merging fb-mercurial sources to the Eden repository,
move everything from the top-level directory into an `eden/scm`
subdirectory.
2019-11-13 16:04:48 -08:00