Commit Graph

367 Commits

Author SHA1 Message Date
Jun Wu
b6cea95ea5 dag: use Bytes to avoid some VertexName copies
Summary:
This is an example about how to use the new Bytes type. The performance change
is not obviously visible in benchmarks since the bottleneck is not at the bytes
copying.

Reviewed By: DurhamG

Differential Revision: D19818720

fbshipit-source-id: a431ae206cfa4fa08b2e162a48b3d7cbcd900f7f
2020-02-28 09:23:59 -08:00
Jun Wu
76ab726056 dag: switch from bytes to minibytes
Summary: The APIs are compatible so the switch is straightforward.

Reviewed By: DurhamG

Differential Revision: D19818713

fbshipit-source-id: 504e9149567c90eb661804e0dad20580a401aa76
2020-02-28 09:23:59 -08:00
Jun Wu
9e3920ca1c dag: fix benchmarks
Summary: D19559127 forgot those files.

Reviewed By: DurhamG

Differential Revision: D19818715

fbshipit-source-id: 92321492eae89ed9f748800b3bfcc306a54aab20
2020-02-28 09:23:59 -08:00
Jun Wu
c417232b1b mutationstore: update lag_threshold
Summary:
D20042045 changes the meaning of "lag_threshold". Update the value in mutation
store accordingly.

Reviewed By: DurhamG

Differential Revision: D20043116

fbshipit-source-id: 154e6dc2aa88ab0a9a9b21929ae5fa6163dcd403
2020-02-28 09:23:59 -08:00
Jun Wu
1962fd5f5b indexedlog: update lagging indexes at open time
Summary:
Previously indexes are only updated at `sync()` time. This diff makes it so
`open()` can also update lagging indexes. This should make index migration
(ex. D19851355) smoother - indexes are built in time and users suffer less from
the absent of indexes.

Reviewed By: DurhamG

Differential Revision: D20042046

fbshipit-source-id: 20412661a0ca4f5f67b671137c47b6373a42981d
2020-02-28 09:23:58 -08:00
Jun Wu
6da3bdadd2 indexedlog: extract logic writing indexes to disk to a method
Summary: The logic is currently only used by `sync()`. I'd like to reuse it at `open()`.

Reviewed By: DurhamG

Differential Revision: D20042044

fbshipit-source-id: 5c9734ff68bdcf8f8c8710c6a821b18d3afeaca0
2020-02-28 09:23:58 -08:00
Jun Wu
afb24f8a8a indexedlog: change IndexDef.lag_threshold from bytes to entries
Summary:
This is more friendly for indexedlog users - deciding lag_threshold by number
of entries is easier than by bytes.

Initially, I thought checking `bytes` is cheaper and checking `entries` is more
expensive. However, practically we will have to build indexes for `entires`
anyway. So we do know the number of entries lagging behind.

Reviewed By: DurhamG

Differential Revision: D20042045

fbshipit-source-id: 73042e406bd8b262d5ef9875e45a3fd5f29f78cf
2020-02-28 09:23:58 -08:00
Jun Wu
55363a78a7 indexedlog: add API to convert &[u8] to zero-copy Bytes
Summary:
This can be useful for users of indexedlog when they want `Bytes` (to get rid
of the lifetime parameter).

This might be useful for storage layer that wants to take the ownership of the
returned bytes.

Reviewed By: xavierd

Differential Revision: D19818714

fbshipit-source-id: cb2d4e7deff921915e07454fee15cb94a3d5c00d
2020-02-28 09:23:57 -08:00
Jun Wu
556850e715 indexedlog: remove unused mmap utility functions
Summary: Those utilities are no longer necessary since the new code uses Bytes.

Reviewed By: xavierd

Differential Revision: D19818717

fbshipit-source-id: 0b43af0f1eae1a4288e84d4170db058b27f80334
2020-02-28 09:23:57 -08:00
Jun Wu
aaf59c569d indexedlog: replace Mmap with Bytes in Log
Summary: This simplifies the code a bit and makes it cheaper to clone the Log.

Reviewed By: xavierd

Differential Revision: D19818716

fbshipit-source-id: bbf07b8b36009d53b63d8066ec422fc3c3796840
2020-02-28 09:23:57 -08:00
Jun Wu
90ee3cb05a indexedlog: remove ChecksumTable
Summary: It's no longer used since Index now has inlined its checksum logic.

Reviewed By: ikostia

Differential Revision: D19850744

fbshipit-source-id: eb134e4c1613573a2d238710b44ad8119c80a5ee
2020-02-28 09:23:56 -08:00
Jun Wu
a1601bfdd9 indexedlog: bump index filename
Summary:
Change index filename and metadata name. This makes sure the new format and old
format are separate so upgrading or downgrading won't have issues.

Reviewed By: DurhamG

Differential Revision: D19851355

fbshipit-source-id: 25dee018073a90040f5818b32b753a3f589c10e0
2020-02-28 09:23:56 -08:00
Jun Wu
6f4bf325d5 indexedlog: write Checksum inline with Log
Summary:
Enhance the index format: The Root entry can be followed by an optional
Checksum entry which replaces the need of ChecksumTable.

The format is backwards compatible since the old format will be just
treated as "there is no ChecksumTable", and the ChecksumTable will be built on
the next "flush".

This change is non-trivial. But the tests are pretty strong - the bitflip test
alone covered a lot of issues, and the dump of Index content helps a lot too.

For the index itself without ".sum", checksum, this change is bi-directional
compatible:
1. New code reading old file will just think the old file does not have the
   checksum entry, similar to new code having checksum disabled.
2. Old code will think the root+checksum slice is the "root" entry. Parsing
   the root entry is fine since it does not complain about unknown data at the
   end.

However, this change dropped the logic updating ".sum" files. That part is an
issue blocking old clients from reading new data.

Reviewed By: DurhamG

Differential Revision: D19850741

fbshipit-source-id: 551a45cd5422f1fb4c5b08e3b207a2ffe3d93dea
2020-02-28 09:23:55 -08:00
Jun Wu
b9e3046a8d indexedlog: add Checksum entry to Index
Summary:
To solve the soundness issue of ChecksumTable raised by the last diff.
I plan to move Checksum logic to Index. This has multiple benefits:
- Solve the soundness issue of ChecksumTable.
- Indexedlog no longer writes the ".sum" files. `atomic_write` can be quite
  slow (tens of milliseconds) on Windows. So this should help perf - with
  many indexes, it can save hundreds of milliseconds on Windows per
  indexedlog sync.

This diff adds the definition and serialization of the new Checksum entry.
The index format is not updated yet.

Reviewed By: markbt

Differential Revision: D19850742

fbshipit-source-id: df6e6ed12a12ef0d2a782dc9d6b4dc5dec3f4b46
2020-02-28 09:23:55 -08:00
Jun Wu
0f09413ed4 indexedlog: add a broken test showing checksum_table is racy
Summary:
With the last change, mmap cost is reduced, but ChecksumTable is unsound in a
corner case: the buffer to check is shorter than what ChecksumTable covers:

    checksum:  |----chunk----|----chunk----|----chunk--|
    buf:       |-------------------------------|       |
                                               ^       ^
                                        logic len     physical len

The checksum table will be unable to verify the last chunk, since it does not
have enough data in buf.

The issues is exposed by stress testing the multithread sync tests. It's not
always easy to reproduce, though.

Reviewed By: markbt

Differential Revision: D19850745

fbshipit-source-id: a1a96080163b7b9b56dcd6c1673d5d8d10e18a2b
2020-02-28 09:23:55 -08:00
Jun Wu
1e10527482 indexedlog: share Bytes between Index and ChecksumTable
Summary: This avoids some extra mmap syscalls by ChecksumTable.

Reviewed By: xavierd

Differential Revision: D19818721

fbshipit-source-id: dace55193f2b4b0f35e3868781faa2d2998d3b58
2020-02-28 09:23:54 -08:00
Jun Wu
1ece621c4d indexedlog: replace Mmap with Bytes in Index
Summary:
This simplifies the code a bit (no special cases about 0-sized mmap buffers)
and makes it cheaper to clone the index buffer (just an Arc::clone, without
another mmap syscall).

Reviewed By: xavierd

Differential Revision: D19818718

fbshipit-source-id: e96d42af74c7f0bb11703c5da31cdfbd5d76c372
2020-02-28 09:23:54 -08:00
Jun Wu
918672b106 tracing-collector: support owned strings in TreeSpans
Summary:
TreeSpans used to use `&str`, which adds a lifetime to the struct, making it
harder to be used in the Python land. Use a type parameter so TreeSpans<String>
can be used.

Reviewed By: DurhamG

Differential Revision: D19797708

fbshipit-source-id: c66429abfaf16d876151ca6f29da976bed91485d
2020-02-28 09:16:14 -08:00
Jun Wu
4cd7df6a01 tracing-collector: rename structs
Summary:
TreeSpan -> RawTreeSpan; TreeSpanWithMeta -> TreeSpanRef.

I'm going to add a non-reference version of TreeSpanRef.

Differential Revision: D19797701

fbshipit-source-id: 42b04c23d4d0ddbe821b94fa2ccb133ce9eafa05
2020-02-28 09:16:14 -08:00
Jun Wu
957617c8b8 tracing-collector: support filtering in TreeSpans
Summary:
The filtering interface allows callsite to select what they want. It's similar
to manifest walk with files or directory matchers in source control.

Reviewed By: DurhamG

Differential Revision: D19784467

fbshipit-source-id: 5cf6e4016d6fa1c90f8aeccc50809baccd4af5ab
2020-02-28 09:16:13 -08:00
Jun Wu
366e701239 tracing-collector: support Events in TreeSpans
Summary: The idea is that instants (events) can be a drop-in replacement for `ui.log`.

Reviewed By: DurhamG

Differential Revision: D19782897

fbshipit-source-id: 795bbba23d921e460f723f19ef529b203aea366a
2020-02-28 09:16:13 -08:00
Jun Wu
d205592d42 tracing-collector: extract logic finding parent span to a function
Summary: This function will be reused by the next diff.

Reviewed By: DurhamG

Differential Revision: D19782895

fbshipit-source-id: 1e636eabee9b0dffd287a1e6784a24ab2259f51f
2020-02-28 09:16:13 -08:00
Jun Wu
8b5fdc01fc tracing-collector: put treespans into a struct
Summary: This allows us to define methods on the treespans, such as filtering APIs.

Reviewed By: DurhamG

Differential Revision: D19782896

fbshipit-source-id: 2e7bd8344c0196e382728c26a8233abf944bbf29
2020-02-28 09:16:12 -08:00
David Tolnay
37a8401761 rust/thrift: Un-rename futures-preview dependency
Summary: The Thrift generated code depends only on futures 0.3, not 0.1. Thus it isn't necessary to depend on renamed:futures-preview and we can depend on futures-preview directly, which is exposed to Rust code as `futures::`.

Reviewed By: jsgf

Differential Revision: D20145921

fbshipit-source-id: 5cae94ec6747a374c2bf05f124ab237c798de005
2020-02-27 22:27:58 -08:00
Xavier Deguillard
6fac9ebad0 revisionstore: add a get_stripped method to ContentStore
Summary:
This new method returns the content of a blob without the copy-from metadata
header.

Reviewed By: DurhamG

Differential Revision: D20102889

fbshipit-source-id: e96f636b7d30460b59707a2cb700d667e616116a
2020-02-27 12:29:42 -08:00
Jun Wu
bce29c9562 nameset: UnionSet
Summary: Similar to Mercurial's smartset.addset.

Reviewed By: sfilipco

Differential Revision: D19912106

fbshipit-source-id: 0d0c8d0b71d2757259d26295eb4a564fea807dea
2020-02-27 07:34:57 -08:00
Jun Wu
c906a21ce1 nameset: initial NameSet abstraction
Summary:
The NameSet is something similar to SpanSet and Mercurial's smartset but speaks
VertexNames instead of Ids. The idea is, NameSet will be part of NameDag APIs,
and potentially replace Mercurial's smartset layer (just smartset the container
types, not the revset language), in a way that revision numbers are completely
hidden behind the scenes.

This diff adds some basic abstraction around iteration-related operations.
Other operations will be added later.

Reviewed By: sfilipco

Differential Revision: D19912109

fbshipit-source-id: 504a26c074282ec51f260535ca63e943124f688e
2020-02-27 07:34:57 -08:00
Adam Simpkins
0ffcf3e450 update the Rust print_status() function to take an IO parameter
Summary:
Update the `print_status()` function to take a `clidispatch::io::IO` object as
a parameter, instead of a simple output object.  This will allow us to also
print error messages from this function in a future diff.

Reviewed By: quark-zju

Differential Revision: D19958504

fbshipit-source-id: bf482fdc4420e1350363a730c6a539cd760aef25
2020-02-26 14:54:40 -08:00
Adam Simpkins
b22fc79e4b clean up PathRelativizer API usage of Path vs PathBuf
Summary:
Fix the PathRelativizer APIs to accept `Path` and even `str` arguments instead
of just `PathBuf`.  The old code required a `PathBuf`, which often forced
callers to make a copy of the path data.

Reviewed By: quark-zju

Differential Revision: D19958505

fbshipit-source-id: 6fa40dd4b75df4e3faf9ad2ae4f0e4e6595669f6
2020-02-24 15:38:36 -08:00
Xavier Deguillard
934b64397b convert to bytes 0.5
Summary:
The bytes 0.5 is a depencency of newer tokio, it's also newer, and thus better.
Staying on 0.4 means that copies between Bytes 0.4 and 0.5 need to be done,
this will be especially bad in the LFS code since 10+MB buffer will have to be
copied...

One main API change is for the configparser. The code used to take Into<Bytes>
for the keys, I switched it to AsRef<[u8]>.

For hg_memcache_client, an extra copy is performed to build a Delta, since this
code uses an old tokio, and is being replaced right now, the effort of
switching to a new tokio and new bytes was not deemed worth it, the copy will
do for now.

Reviewed By: dtolnay

Differential Revision: D20043137

fbshipit-source-id: 395bfc3749a3b1bdfea652262019ac6a086e61e0
2020-02-24 10:28:46 -08:00
Jun Wu
142937c2f8 cargo: bump serde_cbor to 0.11
Summary: Follow up of D20024491.

Reviewed By: sfilipco

Differential Revision: D20043585

fbshipit-source-id: f66896c8f41c3918fb37611d87fa26c39cdecef1
2020-02-21 14:08:43 -08:00
Xavier Deguillard
44c4f2f5d9 revisionstore: add copyfrom information to the LFS pointer
Summary:
Mercurial filenode hash is computed by including the copy information in the
blob header. Before computing the blob content hash, or returning it to the
upper layers, we need to either strip or reconstruct this header appropriately.

Reviewed By: DurhamG

Differential Revision: D19975887

fbshipit-source-id: 7555e7219e50f4d18ec677fdecc216ee705d7af4
2020-02-20 14:28:52 -08:00
Xavier Deguillard
7fb75ce4f0 lfs: move contenthash computation to the enum impl
Summary: This will make it easier to support more hash schemes in the future.

Reviewed By: DurhamG

Differential Revision: D19975888

fbshipit-source-id: 8b8ce3b20d72199bac3cd20a48475b5ab56bfc52
2020-02-20 14:28:52 -08:00
Xavier Deguillard
cd56a8b39a revisionstore: move Arc outside of the stores
Summary:
With the Arc embedded into the store themselves, this forces a second
allocation in order to use them as trait objects. Since in most cases, we do
not want the stores themselves to be cloneable, we can move the Arc outside and
thus reduce the number of pointer indirection.

Reviewed By: DurhamG

Differential Revision: D19867568

fbshipit-source-id: 9cd126831fe2b9ee715472ac3299b7a09df95fce
2020-02-20 14:28:52 -08:00
Xavier Deguillard
7c1a623d8a revisionstore: add the LfsStore to the ContentStore
Summary:
The ContentStore now can read LFS blobs from both the shared cache, and the
local store.

Reviewed By: DurhamG

Differential Revision: D19866249

fbshipit-source-id: a6fb3523495e9d3832613b56438f631cfa552b91
2020-02-20 14:28:51 -08:00
Xavier Deguillard
58d9d92e88 revisionstore: simplify ContentStore/MetadataStore initialization a bit
Summary:
With the LFS store being added, and the indexedlog being soon used for trees,
this simplification should help in formalizing the hierarchy of files/folders.

It will look like the following:
  <root dir>/lfs: for the lfs store
  <root dir>/indexedlog*: for the indexedlog
  <root dir>/foobar: for a hypothetical foobar store

For manifests, <root dir> will therefore be: <store dir>/manifests. The
unfortunate part is that the current tree data lives under
<store dir>/packs/manifests. As packfiles will be replaced, this small
discrepency is acceptable.

Reviewed By: DurhamG

Differential Revision: D19866248

fbshipit-source-id: 7ef59ef7df19149b19a529b4f4a45a479cc9d23b
2020-02-20 14:28:51 -08:00
Xavier Deguillard
f512b5658d revisionstore: add an LfsStore
Summary:
This is the first step in having a stronger integration between LFS blobs and
the ContentStore abstraction. The 2 main difference between the Python based
LFS implementation and this one are:
 - pointers are not stored alongside plain data,
 - blobs are split between local and shared blobs

As of now, no reclamation is being performed for shared blobs, blobs aren't
fetched or uploaded. This will come in future diffs.

Reviewed By: DurhamG

Differential Revision: D19859291

fbshipit-source-id: 45000fc574e6fbd6d3487f4966cad4f49dab731c
2020-02-20 14:28:51 -08:00
Adam Simpkins
5ffa268af2 use absolute includes for the native cext modules
Summary:
Update the C files under edenscm/mercurial/cext to use absolute includes from
the repository root.  Also update a few of the libraries in edenscm/mercurial
that the cext code depends on.

This makes these files easier to build with Buck in fbsource, and reduces the
number of places where we have to use deprecated Buck functionality to help
find these headers.  This also allows autodeps to work with the build targets
for these rules.

Reviewed By: xavierd

Differential Revision: D19958221

fbshipit-source-id: e6e471583a795ba5773bae5f16ed582c9c5fd57e
2020-02-19 13:05:06 -08:00
Lukas Piatkowski
c4f0887fc2 eden/scm: cover xdiff with autocargo
Summary: Generate the Cargo.toml files inside xdiff with autocargo. This will enable Mononoke to depend on this code easily without sacrificing anything on eden/scm side.

Reviewed By: aslpavel

Differential Revision: D19948741

fbshipit-source-id: 905ff3d64b90830e5f075e4c6ed2b3de959e3f00
2020-02-19 05:15:17 -08:00
Xavier Deguillard
d8064b5e2a types: add a Sha256 type
Summary: This will be used in the LFS store.

Reviewed By: DurhamG

Differential Revision: D19895803

fbshipit-source-id: 4cf447987c10fed0b5c98904f20c841428965d89
2020-02-18 08:32:33 -08:00
Xavier Deguillard
17cc9ab5ab revisionstore: add a wrapper around IndexedLog/RotateLog
Summary:
In some cases, higher level stores may want to store data in either a plain
IndexedLog, or in a RotateLog, for local and shared data. Due to slight
difference between the 2, they can't easily be adapted into a common trait.

Instead let's just wrap both into an enum and implement the main functions that
the higher level stores need.

The first use of this will be the LfsStore, future use will include the
IndexedLogDataStore and the IndexedLogHistoryStores.

Reviewed By: DurhamG

Differential Revision: D19859292

fbshipit-source-id: 920572e0cf5f69bda4901a727a6b0dc0f08fc8d0
2020-02-18 08:32:32 -08:00
Durham Goode
f530333e06 edenfs: update eden thrift types
Summary:
When I run make local it's creating changes in our checked in thrift
types. I guess I need to check these in?

Reviewed By: quark-zju

Differential Revision: D19848706

fbshipit-source-id: 8a2e9a2617734eda41eade1f2645689362b1d75d
2020-02-17 14:52:29 -08:00
Xavier Deguillard
7bb3e384d8 remotefilelog: append the repo name to memcache key
Summary:
Up to now, this has been done in chef, and thus for repos that we do not list,
they may share the memcache keys, with potential unintended consequences. Let's
always add the repo name to the key, so we can simplify the code in chef.

One small negative effect of this change is that while it is being rolled out,
the cache hit rate will be impacted. This should resolve itself quickly.

Reviewed By: DurhamG

Differential Revision: D19885775

fbshipit-source-id: 0b59ce9e378b0ab70f696a39d19d27cd89921098
2020-02-14 14:10:48 -08:00
Xavier Deguillard
28564b228d backingstore: do not fail if memcache can't be initialized
Summary:
Failing means that we fallback to the Python importer. Let's simply warn about
it.

Reviewed By: fanzeyi

Differential Revision: D19897274

fbshipit-source-id: f9c63f5aa76015c28b31f00bba98244f5c86e923
2020-02-14 09:00:27 -08:00
Jun Wu
03baa31789 indexedlog: switch from bytes to minibytes
Summary:
This makes it possible to use `Bytes` for mmap buffers.

The changes are because `minibytes::Bytes` does not implement `From<&[u8]>`
with the intention to make slice copy explicit.

Reviewed By: xavierd

Differential Revision: D19818719

fbshipit-source-id: c34ee451bfd2dc7bcbbcebd52a76444b6c236849
2020-02-12 13:57:37 -08:00
Xavier Deguillard
49c953bc7e backingstore: plumb the MemcacheStore
Summary:
EdenFS will now be able to fetch blobs directly from memcache. This won't have
any big benefits as no blobs are in memcache right now, but over time, this
will significantly reduce the cost of fetching blobs.

Reviewed By: fanzeyi

Differential Revision: D19861643

fbshipit-source-id: c2e9d317bd30d4656bf0b3f8897794161697761a
2020-02-12 13:43:00 -08:00
Xavier Deguillard
a4b83e384a revisionstore: add tracing point for memcache
Summary:
These tracing points will help us understand the memcache hit rate as well as
the fetching speed.

Reviewed By: quark-zju

Differential Revision: D19836499

fbshipit-source-id: 1936c44efc3e7715069e6a959f5331139d591d5c
2020-02-12 10:38:59 -08:00
Xavier Deguillard
2c4a10bf4b revisionstore: move memcache set to a background thread
Summary:
Everytime a cache miss is seen, the data fetched from the server will be sent
directly to memcache for future use. Unfortunately, doing so in a blocking
manner severely impact the overall fetching speed from the server. Since
memcache is purely an optimization, we can afford to send data to it
asynchronously.

Let's move as much as possible of the code to a background thread to reduce the
overhead of memcache.

Reviewed By: DurhamG

Differential Revision: D19836011

fbshipit-source-id: 68e506ef7464d6e99d98457d0d37178f514be1a9
2020-02-12 10:38:59 -08:00
Xavier Deguillard
dc7f7908ef revisionstore: prefetch data with get_iter
Summary:
Instead of fetching data one-by-one, let's prefetch data concurrently by using
the new get_iter function.

Reviewed By: DurhamG

Differential Revision: D19836009

fbshipit-source-id: 4a50328c0cbbba677c2de3777ebe4c34cb10c1e2
2020-02-12 10:38:58 -08:00
Xavier Deguillard
8b082a18f7 revisionstore: don't prefetch with an empty key set
Summary:
Even when memcache would be able to prefetch everything, this would always call
into the underlying remote store with an empt key set. For things like `hg
prefetch` and a large number of keys, the effect of doing that is minimum, but
for EdenFS or `hg log -p`, the roundtrip to the server for every file/revision
would add a significant amount of overhead. Let's simply stop iterating when we
no longer need to fetch anything.

Reviewed By: DurhamG

Differential Revision: D19835797

fbshipit-source-id: 54ad704428c3b20d973cfa87f7171899ec44b3f9
2020-02-11 18:05:16 -08:00
Jun Wu
abb8ccb346 minibytes: add serde support
Summary:
See also https://github.com/serde-rs/bytes/.

This will be used in the `dag` crate.

Reviewed By: DurhamG

Differential Revision: D19770858

fbshipit-source-id: 2a870a564e0ceecdc7a4667853b2b2a5ea4ce6e3
2020-02-07 14:21:39 -08:00
Jun Wu
8029cd3878 minibytes: port benchmark from tokio/bytes
Summary:
Performance looks okay comparing with tokio/bytes v0.5.4:

minibytes:

  test clone_arc_vec        ... bench:      16,542 ns/iter (+/- 1,524)
  test clone_shared         ... bench:      16,211 ns/iter (+/- 596)
  test clone_static         ... bench:       1,437 ns/iter (+/- 502)
  test deref_shared         ... bench:         367 ns/iter (+/- 101)
  test deref_static         ... bench:         366 ns/iter (+/- 1)
  test deref_unique         ... bench:         367 ns/iter (+/- 4)
  test from_long_slicd      ... bench:          91 ns/iter (+/- 18) = 1406 MB/s
  test slice_empty          ... bench:      10,382 ns/iter (+/- 104)
  test slice_short_from_arc ... bench:      23,823 ns/iter (+/- 1,411)

tokio/bytes:

  test clone_arc_vec        ... bench:      16,213 ns/iter (+/- 1,864)
  test clone_shared         ... bench:      18,685 ns/iter (+/- 634)
  test clone_static         ... bench:       3,983 ns/iter (+/- 163)
  test deref_shared         ... bench:         366 ns/iter (+/- 26)
  test deref_static         ... bench:         373 ns/iter (+/- 36)
  test deref_unique         ... bench:         391 ns/iter (+/- 33)
  test from_long_slice      ... bench:          67 ns/iter (+/- 7) = 1910 MB/s
  test slice_empty          ... bench:      15,149 ns/iter (+/- 1,708)
  test slice_short_from_arc ... bench:      36,541 ns/iter (+/- 3,485)

clone_static is faster because minibytes don't call into vtable's clone.
from_long_slice is slower because minibytes uses Arc unconditionally while bytes
can avoid Arc overhead if refcount is 1.

Reviewed By: DurhamG

Differential Revision: D19770857

fbshipit-source-id: 5bafcc57a38c68baccfcafd3906f1a47b2bf4530
2020-02-07 14:21:39 -08:00
Jun Wu
108f1c947a minibytes: minimalist zero-copy Bytes with mmap support
Summary:
This crate provides the core features of the commonly known `Bytes` crate:
zero-copy slicing and cloning, while also supports mmap-backed buffers.

The main motivation is to replace `Mmap` in `indexedlog`. That has multiple
benefits:
- Handles 0-sized mmap more cleanly.
- Handles clones more cleanly.
- Gain the flexibility to zero-copy data without lifetime / reference.
- Gain the flexibility to switch to non-mmap data.

The `bytes::Bytes` crate does not yet support mmap buffers as of its latest
release (0.5.4).

Implementation wise, `minibytes::Bytes` uses `Option<Arc<dyn Trait>>` for the
"trait object". This makes implementing the mmap storage just one line.
`bytes 0.5.4` re-invents the "trait object" manually using unsafe code. It requires
about 50 lines to implement the mmap storage (in D19756122).

Reviewed By: xavierd

Differential Revision: D19770856

fbshipit-source-id: 8cfa7052a18ac2e0cd6348b77d5e2a4acc61195c
2020-02-07 14:21:38 -08:00
Jun Wu
69aa37f23b tracing: limit column width on ASCII output
Summary: This makes the output more readable even if the "name" of a span is very long.

Reviewed By: DurhamG

Differential Revision: D19780536

fbshipit-source-id: dce0d3777409c32b0752db51341a572addb823ea
2020-02-06 15:46:53 -08:00
Xavier Deguillard
6ea4bb998e revisionstore: move memcache initialization to a background thread
Summary:
As initializing the memcache client takes ~0.7s, let's move it to a background
thread as to not impact Mercurial startup time. This diff uses ArcSwap in
order to reduce the overhead of the very common read paths as much as possible.
Using Mutex or RwLock instead would have caused unecessary contention.

Reviewed By: DurhamG

Differential Revision: D19518693

fbshipit-source-id: 886e9b86813fda6ff005ccce99659890026f643a
2020-02-05 14:01:54 -08:00
Xavier Deguillard
b8947748b5 pyrevisionstore: expose the memcache client to python
Summary:
This allows the Python code to build a memcache client and build ContentStore
and MetadataStore with it.

Reviewed By: DurhamG

Differential Revision: D19518694

fbshipit-source-id: d932fd5223ccfdf37db69cbb54a11a6571312709
2020-02-05 14:01:54 -08:00
Xavier Deguillard
920ea27a17 revisionstore: add memcache client
Summary:
This enables an in-process memcache client for the Rust
ContentStore/MetadataStore. For now, this implementation is lacking several
necessary optimization:
 - Start-up time is always slowed down by ~0.7s, the initialization will be
   moved to a background thread
 - Writing data to memcache is blocking and will be moved to a background
   thread too.
 - Prefetching data does a roundtrip to memcache for every key, batching
   memcache APIs will be added.

Compared to the existing hg_memcache_client, this implementation is both
significantly shorter and do not exhibit some of the pathological behavior of
having to flush the indexedlog for every fetched blob when used in Eden.

Reviewed By: DurhamG

Differential Revision: D19518696

fbshipit-source-id: 4725447d13e7eddd9586135c2511e13ddb921771
2020-02-05 14:01:53 -08:00
Jun Wu
7316c4cc22 cpython-ext: add a way to wrap Rust Write object into a Python object
Summary:
The library already has a way to wrap a Python object into a Rust object that
exposes the Rust Read/Write interface. This is the reverse direction for
the Write interface.

The initial intention is to expose Rust stdout as described in D19702533.
However, I found Python's `sys.stdout.buffer` also enforces utf-8 encoding
on Windows (unless PYTHONLEGACYWINDOWSSTDIO is set). So Python's
stdout actually behaves similarly with Rust's stdout on Windows and is okay
to use. That said, it's still useful to have this abstraction, for streampager [1]
integration.

[1]: https://github.com/markbt/streampager/

Reviewed By: sfilipco

Differential Revision: D19716127

fbshipit-source-id: ba39898122561d9a49b7080ee95d7c940540eb40
2020-02-04 18:41:13 -08:00
David Tolnay
34a520536a Update rustfmt and reformat fbsource
Summary:
```
$ tools/third-party/rustfmt/rustfmt --version
rustfmt 1.4.11-nightly (1838235 2019-12-03)
```

Reviewed By: zertosh

Differential Revision: D19704678

fbshipit-source-id: fe8707e964495e76746edcb8b68e34fc1411f52a
2020-02-04 17:14:27 -08:00
Jun Wu
3e0b781197 py3: only use binary stdin/stdout/stderr
Summary:
Drop stdoutbytes/stdinbytes. They make things unnecessarily complicated
(especially for chg / Rust dispatch entry point).

The new idea is IO are using bytes. Text are written in utf-8 (Python 3) or
local encoding (Python 2). To make stdout behave reasonably on systems not
using utf-8 locale (ex. Windows), we might add a Rust binding to Rust's stdout,
which does the right thing:
- When writing to stdout console, expect text to be utf-8 encoded and do proper decoding.
- Wehn writing to stdout file, write the raw bytes without translation.

Note Python's `sys.stdout.buffer` does not do translation when writing to stdout console
like Rust's stdout.

For now, my main motivation of this change is to fix chg on Python 3.

Reviewed By: xavierd

Differential Revision: D19702533

fbshipit-source-id: 74704c83e1b200ff66fb3a2d23d97ff21c7239c8
2020-02-03 18:26:57 -08:00
Mateusz Kwapich
e2dc4e8014 diff: update for py3
Summary:
All diff functions are (bytes, bytes) -> bytes to preserver
the original file encoding.

Because of that I had to add ui.writebytes output function that accepts
bytes for terminal output.

Reviewed By: farnz

Differential Revision: D19656673

fbshipit-source-id: b9a1e4361e825fc8c2313e8402c2bbe00f490dd4
2020-01-31 13:00:23 -08:00
Mark Thomas
18ecb01b8a mutationstore: update tests so that user is now a string
Summary: D19649887 changed mutation entry users to be strings.  Update the tests accordingly.

Reviewed By: simpkins

Differential Revision: D19656792

fbshipit-source-id: fcff677099dc0200130bf30eadaaf66822c6139c
2020-01-30 19:54:45 -08:00
Mark Thomas
914607cac7 cpython-ext: add PyPath for references to paths
Summary: `PyPath` is to `PyPathBuf` as `Path` is to `PathBuf` and `str` is to `String`.

Reviewed By: quark-zju

Differential Revision: D19647995

fbshipit-source-id: 841a5f6fea295bc72b00da028ae256ca38578504
2020-01-30 17:33:35 -08:00
Durham Goode
b567c16b60 py3: make mutation markers 'user' utf8
Summary: Username as utf8, so let's make mutationmarker treat them as such.

Reviewed By: xavierd

Differential Revision: D19649887

fbshipit-source-id: 3f8b2db434a57ee8ee3017de8d925c19a2002b20
2020-01-30 15:22:24 -08:00
Mark Thomas
13b7a759a2 cpython-ext: add PyNone, a marker struct for functions that can only return None
Summary: Add `PyNone`.  This is a marker struct that indicates that a python function can only return `PyNone`.

Reviewed By: xavierd

Differential Revision: D19644338

fbshipit-source-id: f846b146237ebf7de996177494934fec662cde0f
2020-01-30 12:28:38 -08:00
Mark Thomas
6b8042662a cpython_ext: rename PyPath to PyPathBuf
Summary:
`PyPath` is the type that owns the data.  Rename it to `PyPathBuf` for analogy
with `PathBuf` and `RepoPathBuf`, and to allow us to introduce a reference type
named `PyPath`.

Reviewed By: xavierd

Differential Revision: D19643797

fbshipit-source-id: 56d80fea5677f7223e967b0723039d1763b26f68
2020-01-30 11:06:24 -08:00
Jun Wu
c5dd6829c7 cpython-ext: add more utilities for PyPath
Summary:
Make the type easier to use. Namely, the treestate bindings want PyPath <->
bytes since treestate internally uses bytes.

Reviewed By: xavierd

Differential Revision: D19635357

fbshipit-source-id: 37d1889b5da1d7f3869bb7820de0219b87b71a8b
2020-01-30 08:27:33 -08:00
Mark Thomas
1e63f205f4 rust-cpython: allow compilation for both py2 and py3
Summary:
Set up the `cpython-ext` and `hgcommands` libraries so that they can compile
against py2 and py3 versions of rust-cpython.  Make py2 the default so
that cargo test still works.

Reviewed By: singhsrb

Differential Revision: D19615656

fbshipit-source-id: 3403e7077deb3c0a9dfe0e3b7d4f4ad1da73bba3
2020-01-28 20:17:20 -08:00
Adam Simpkins
ad957e7803 py3: update Rust hgcommands code to pass argv to python as Str
Summary:
Update the Rust hgcommands code to pass the command line arguments into the
Python logic as `Str` types, so that this will be Unicode `str` objects when
using Python 3.

Reviewed By: xavierd

Differential Revision: D19596739

fbshipit-source-id: 7cdfd44a1c4ce8b0f86d20b634d9b27eab822b2d
2020-01-28 15:58:37 -08:00
Jun Wu
ed3a2b2247 cpython-ext: add missed types dep
Summary: This is incorrectly removed due to a bad rebase / merge.

Reviewed By: DurhamG

Differential Revision: D19607801

fbshipit-source-id: a6ee7a3f184ff1882eb1f1513f7fed74a7108727
2020-01-28 13:50:14 -08:00
Jun Wu
8703970cea py3: update Cargo.toml to make py3 buildable
Summary:
This makes `make hg3` work. It requires cleaning up the `build` directory when
switching between py2 and py3 build, which will be fixed later.

Reviewed By: DurhamG

Differential Revision: D19604824

fbshipit-source-id: 060ff313420126a5dba935c4451b45dc9af45f13
2020-01-28 13:39:38 -08:00
Xavier Deguillard
d087e39a34 pypathmatcher: use PyPath instead of PyByte
Reviewed By: DurhamG

Differential Revision: D19592136

fbshipit-source-id: 5db6ca629cd920d52ffbf7f10963c44c8f7b203d
2020-01-28 12:40:48 -08:00
Adam Simpkins
beff6fdea7 py3: add additional from() conversion methods for Str
Summary: Add methods to convert to a `Str` object from `String` and from `Vec[u8]`

Reviewed By: xavierd

Differential Revision: D19596743

fbshipit-source-id: 6499f7f1b8329f4d14ce8179a41ed46982a85c8e
2020-01-28 12:25:39 -08:00
Mark Thomas
4fe02f3607 bindings: update to rust-cpython 0.4
Summary: Update to the new version of rust-cpython.  This supports `list.append`, so make use of it.

Reviewed By: xavierd

Differential Revision: D19590905

fbshipit-source-id: 03609d4f698ae8e4380e82b8144caaa205b4c2d4
2020-01-28 10:46:33 -08:00
Stefan Filip
5720b9a2a1 py3/pymanifest: convert path types from PyBytes to PyPath
Reviewed By: xavierd

Differential Revision: D19594134

fbshipit-source-id: e8532a125aa2ed4b7740e669ad572fcbb327692f
2020-01-28 10:29:11 -08:00
Xavier Deguillard
283b120bb6 pyconfigparser: use PyPath instead of PyByte
Summary:
Also, add a util::path::strip_unc function that is more clear than the
normalize_for_display

Reviewed By: DurhamG

Differential Revision: D19595961

fbshipit-source-id: 330bcb708bf64320a3562d79db685d6cb1e14f16
2020-01-28 10:14:14 -08:00
Xavier Deguillard
61aaf894c3 pyrevisionstore: use PyPath instead of PyBytes
Summary:
For Python3 compatibility, let's use PyPath, it hides the logic of encoding for
Python2

Reviewed By: DurhamG

Differential Revision: D19590024

fbshipit-source-id: 7bed134a500b266837f3cab9b10604e1f34cc4a0
2020-01-28 10:01:50 -08:00
Jun Wu
373073df47 py3/rust: cpython-ext: optionally show Python error traceback
Summary:
This is optional, but it helps investigating Python errors chained with other
Rust errors.

For example:

  error.RustError: failed fetching from store (, cc38739855a7f356b4a2aaac0a0a858fd646e6bf)
  Caused by:
    TypeError()
    Traceback (most recent call last):
      File "scm3/edenscm/hgext/remotefilelog/contentstore.py", line 53, in get
        chain = self.getdeltachain(name, node)
      File "scm3/edenscm/hgext/remotefilelog/contentstore.py", line 91, in getdeltachain
        chain = self._getpartialchain(name, node)
      File "scm3/edenscm/hgext/remotefilelog/contentstore.py", line 125, in _getpartialchain
        return store.getdeltachain(name, node)
    TypeError

Without this diff there is only "TypeError()" without the traceback.
This can be turned off by unsetting RUST_BACKTRACE.

Reviewed By: markbt

Differential Revision: D19581173

fbshipit-source-id: 74605b78146b6b1c9ddd5ad720dcd19ff73908a8
2020-01-27 18:56:10 -08:00
Xavier Deguillard
24ae9f9592 cpython-ext: fix python3 compile error
Summary: The format_err is used in shared code too, we need to import it.

Reviewed By: quark-zju

Differential Revision: D19592591

fbshipit-source-id: bd344bf3c295473f4647235a98432d11c9678bf9
2020-01-27 16:58:42 -08:00
Xavier Deguillard
33ea1763ce cpython-ext: add a PyPath type
Summary:
This will be used as an argument to the Rust bindings when using paths. This
type is either a PyBytes in Python2 and uses the various encoding function to
convert into a String, or a PyUnicode in Python3 with no encoding change.

Reviewed By: farnz

Differential Revision: D19587890

fbshipit-source-id: 58903426585693193754691fe3c756b9097b35f6
2020-01-27 16:50:14 -08:00
Xavier Deguillard
e512c370fd py3/rust: cpython-ext: set ob_size on raw PyObject
Summary:
Without this, Rust code using the feature (ex. lz4, used by lz4revlog) will
panic.

Reviewed By: sfilipco

Differential Revision: D19581188

fbshipit-source-id: b499449df4fede27fe66cf8e5af57e8347a0dd48
2020-01-27 16:50:14 -08:00
Xavier Deguillard
f16bb04977 py3/rust: cpython-ext: support memoryview as PySimpleBuf
Summary: Otherwise we got RustPanic when clindex or dagindex reads mmapped changelog.i.

Reviewed By: sfilipco

Differential Revision: D19581189

fbshipit-source-id: 3ee74a1bd000d58272551ae404dcfe7f957bb2c0
2020-01-27 16:50:13 -08:00
Xavier Deguillard
ad58839ca1 py3/rust: use Str type in cliparser and hgcommands
Reviewed By: sfilipco

Differential Revision: D19581176

fbshipit-source-id: e92e5c2538537ec16da25a9819c9a097a24a4d6e
2020-01-27 16:50:13 -08:00
Xavier Deguillard
1c697dbc49 py3/rust: cpython-ext: add Str type
Summary: This converts to bytes on Python 2, but unicode on Python 3.

Reviewed By: markbt

Differential Revision: D19581180

fbshipit-source-id: 0de9056a01ae30810a72352387de5a940b37d7ab
2020-01-27 16:50:13 -08:00
Xavier Deguillard
789d2b5fbb py3/rust: types: add AsRef<str> for RepoPath
Summary:
In a future diff, I have RepoPath in Rust and want to send unicode path to
Python.

Reviewed By: sfilipco

Differential Revision: D19581184

fbshipit-source-id: 73a03707a6bdae4a497a8ee2c14314aa4ffefb6d
2020-01-27 16:50:12 -08:00
Kostia Balytskyi
6bf47a9f5a hgtime: fix corner case of date range parsing
Summary: The docs promise that both `<` and `>` bounds are inclusive, so let's fix that.

Reviewed By: markbt

Differential Revision: D19580840

fbshipit-source-id: 13770a8e9351fe62f58e9a701b526a167752543a
2020-01-27 09:37:00 -08:00
Stefan Filip
d78982a6e8 dag: move iddag to own file
Summary:
Separate Segment and IdDag in two individual files. This is preparation for
refactoring IdDag to be more flexible in terms of storage. That will probably
involve moving stuff out of IdDag into a new file that deals with the storage
abstractions.

Reviewed By: quark-zju

Differential Revision: D19559127

fbshipit-source-id: b3b9b18e2653157e69148b1f29292a57b30016ec
2020-01-24 15:49:54 -08:00
Jun Wu
52af332c28 renderdag: add tests showing how orders affect rendering
Summary:
I wrote it to understand how renderdag draws the same graph with different
orders. It seems useful for future optimization that tries to reduce the number
of columns. So let's check it in.

Reviewed By: xavierd

Differential Revision: D19440713

fbshipit-source-id: 8bc580799f6b24c87886d5ac306020f50bb694e5
2020-01-23 20:50:56 -08:00
Jun Wu
29c749ef7d dag: add fuzz tests on the octopus DAG
Summary: This gives us some confidence about octopus merge handling.

Reviewed By: DurhamG

Differential Revision: D19540726

fbshipit-source-id: e84de74aecae54429483edd185d39fd1bd858f87
2020-01-23 17:58:51 -08:00
Jun Wu
8ac97da54e bindag: make TestContext more flexible
Summary:
TestContext uses ParentRevs. That limits parents to at most 2.

Use a type parameter so we can opt-in Vec<usize> for octopus merge support,
at the cost of worse cache efficiency.

Reviewed By: DurhamG

Differential Revision: D19540727

fbshipit-source-id: f9e8de151b7b296fd6f0fd89be9de2b8996634c7
2020-01-23 17:58:51 -08:00
Jun Wu
df23791d08 bindag: add some octopus examples
Summary:
Our new algorithms support octopus merges. However there were no tests using
octopus merges. This diff adds a simple one.

Reviewed By: DurhamG

Differential Revision: D19540728

fbshipit-source-id: 8411024f0b7e27c2ebfabbe1935496124c25df7b
2020-01-23 17:58:51 -08:00
Jun Wu
494bdae7cc dag: add a fuzz test about range algorithm
Summary:
The test runs the old and new algorithm and compares their result.  This is more
interesting than using random numbers, since the fuzzing framework will try to
explore new code paths.

Reviewed By: sfilipco

Differential Revision: D19511576

fbshipit-source-id: e9a2066769b54a60bb92643e5715f91a6fccbcb5
2020-01-23 17:58:50 -08:00
Jun Wu
78ea96cb9d bindag: port range algorithm from hg
Summary:
The ported algorithm will work as a comparison to verify dag's range
implementation.

Reviewed By: sfilipco

Differential Revision: D19511574

fbshipit-source-id: 589353d6e6c91b8d6707c977eeb8558ac733b525
2020-01-23 17:58:50 -08:00
Stefan Filip
f5280b75e9 thrift: update thrift generated files
Summary: Commit updates after having ran `make local`

Reviewed By: xavierd

Differential Revision: D19543278

fbshipit-source-id: 00fdc3ebec32e8a3d706b89402dc91f771984c3c
2020-01-23 16:06:51 -08:00
Xavier Deguillard
b6589bde84 revisionstore: prefetch takes &[Key] instead of Vec<Key>
Summary: This can prevent potential moves and clones on the caller of prefetch.

Reviewed By: quark-zju

Differential Revision: D19518697

fbshipit-source-id: 63839fc3f4bb9ca420e290eabaffb481a3584f7b
2020-01-23 08:57:22 -08:00
Jun Wu
fff2cb833f dag: add a fuzz test about gca algorithm
Summary:
The test runs the old and new algorithm and compares their result.  This is more
interesting than using random numbers, since the fuzzing framework will try to
explore new code paths.

This cannot run on stable Rust yet. I added a README for how to run it.

Reviewed By: sfilipco

Differential Revision: D19504096

fbshipit-source-id: 621da02c50a771dee9932f9d7a407cb1f412a543
2020-01-22 19:30:50 -08:00
Jun Wu
af85f4ff3b bindag: add a way to get a subdag of a parsed bindag
Summary: Sometimes the graph is too large. Provide a way to slice it.

Reviewed By: sfilipco

Differential Revision: D19511575

fbshipit-source-id: 504317d6894764043b23ea49dcf09c8cdea96961
2020-01-22 19:30:49 -08:00
Jun Wu
b5482f8976 bindag: add utilities for easier testing
Summary:
As we plan to test the dag crate with some other DAG implementation,
add a convenient structure that setups both DAG implementations.

Reviewed By: sfilipco

Differential Revision: D19503371

fbshipit-source-id: 3e9933ad37301bfac36eb1af6d82b4298af778b6
2020-01-22 19:30:49 -08:00
Jun Wu
7d11508dfa bindag: port GCA algorithm from hg
Summary:
The ported algorithm will work as a comparison to verify dag's gca
implementation.

Reviewed By: sfilipco

Differential Revision: D19503373

fbshipit-source-id: f5253db89fbcdc2fd02f3fdaa0796e24338b1fba
2020-01-22 19:30:49 -08:00
Jun Wu
a98d288938 bindag: apply smallvec optimization
Summary:
This is similar to D17581248. It will make the old linear-scan algorithm (which
will be added later) about 5x faster.

Reviewed By: sfilipco

Differential Revision: D19503372

fbshipit-source-id: c65d7217e7b144603dadd57f54a5e70f513c8e51
2020-01-22 19:30:48 -08:00
Jun Wu
64271f24ba dag: move bindag from benches to a separate crate
Summary: This allows bindag to be used outside benches.

Reviewed By: sfilipco

Differential Revision: D19503374

fbshipit-source-id: 131061f7d1d28125875a86afc330dbb9634249cf
2020-01-22 19:30:48 -08:00
Xavier Deguillard
524c85d711 revisionstore: limit delta chain to 1000 entries
Summary:
We've seen a case where a datapack contains a circular delta chain, causing
Mercurial to fall into a infinite loop when trying to read it. Let's fail when
the chain is over 1000 entries.

Reviewed By: quark-zju

Differential Revision: D19458453

fbshipit-source-id: bfa503f7807122eca72cf94418abda161dafa41c
2020-01-21 08:50:59 -08:00
Jun Wu
124e275377 dag: make NameDag use MultiLog for data consistency
Summary: This ensures IdMap and IdDag are guaranteed consistent in the storage layer.

Reviewed By: DurhamG

Differential Revision: D19432658

fbshipit-source-id: 00f1a9b4c747baa1f14d78c31d925682317463b4
2020-01-17 21:49:57 -08:00
Jun Wu
907aadcdd7 indexedlog: add MultiLog
Summary: The MultiLog holds multiple Logs and can atomically sync them.

Reviewed By: DurhamG

Differential Revision: D19432659

fbshipit-source-id: 6ac7dc6f74468f985c6a6b0c419e888722a80037
2020-01-17 21:49:57 -08:00
Jun Wu
5aa872599c indexedlog: make ScopedDirLock remember which directory gets locked
Summary: This makes it possible to do extra sanity checks.

Reviewed By: DurhamG

Differential Revision: D19443783

fbshipit-source-id: 254c2537a6aadd25a67c5e48a768187ce65aa686
2020-01-17 21:49:56 -08:00
Jun Wu
2c9a3b9c61 indexedlog: use a method to create LogMetadata
Summary: This makes the code overall shorter.

Reviewed By: DurhamG

Differential Revision: D19443552

fbshipit-source-id: abd1db227830a88549c7eca1cfd08b67c4914518
2020-01-17 21:49:56 -08:00
Zeyi (Rice) Fan
a431e64e4e eden: periodically refresh content store to give it chances to release mapped files
Summary:
As reported by JT, EdenFS may hold the file descriptor of mapped pack files too long even when it is deleted by external processes, thus taking more disk spaces.

This diff fixes this problem by making EdenFS periodically rescan the pack files.

Reviewed By: chadaustin

Differential Revision: D19395439

fbshipit-source-id: 4bfd6a7ac13dceb3099d2704d62b3825433aff4b
2020-01-17 15:00:01 -08:00
Jun Wu
733961456f indexedlog: fix try_clone external key buffer handling
Summary:
In both cases (clone with or without dirty content), the external key buffers
used by indexes should be re-created, since mem_buf location has changed.

Reviewed By: DurhamG

Differential Revision: D19432657

fbshipit-source-id: fe6f76e7ccfd16ccd2f5c1d89866687a3503603e
2020-01-17 03:58:00 -08:00
Jun Wu
85d7253edd indexedlog: add "shared metadata" abstraction
Summary: This allows us to build the multi-log structure.

Reviewed By: DurhamG

Differential Revision: D19431786

fbshipit-source-id: e0e09b0d5a73d293a80626924b74ddf2ce6d6fa3
2020-01-17 03:57:59 -08:00
Jun Wu
36e70264b6 indexedlog: avoid log::OpenOptions::create_in_memory
Summary: Migrate to the new API. This is more compatible with the next change.

Reviewed By: DurhamG

Differential Revision: D19431788

fbshipit-source-id: dba1a88fd003d17fc1774b0cec9157d32c5acdb0
2020-01-17 03:57:59 -08:00
Jun Wu
de5772cb30 indexedlog: replace Log's dir with a new GenericPath type
Summary:
This allows us to incrementally abstract away parts related to filesystem from
Log. For example, instead of using std::fs to access filesystem directly, use
methods on the GenericPath instead.

Right now the main motivation is to add support for "multi-logs" sharing a single meta
file. To do that, methods reading and writing metas are moved to the
GenericPath type.

This also unifies the open functions. Now OpenOptions::create_in_memory can be
private. Empty in-memory Logs can be opened via `open(())`.

Reviewed By: DurhamG

Differential Revision: D19431784

fbshipit-source-id: dbdf94f60261e09f131c6fdd9fe3b99242a28af5
2020-01-17 03:57:59 -08:00
Jun Wu
bf133b797e indexedlog: expose io::ErrorKind in the wrapped error type
Summary: This makes an upcoming change easier.

Reviewed By: DurhamG

Differential Revision: D19432656

fbshipit-source-id: 65adc883c3c3937aa7022196b83c75715b820721
2020-01-17 03:57:58 -08:00
Jun Wu
24393faaeb indexedlog: move log::tests to a separate file
Summary: The goal is to split log.rs to smaller modules so it's easier to reason about.

Reviewed By: DurhamG

Differential Revision: D19431787

fbshipit-source-id: 23b8346c461da322e7a2240b8224e6194867aaee
2020-01-17 03:57:58 -08:00
Jun Wu
38db4b5cdc indexedlog: move log::OpenOptions to standalone modules
Summary:
The goal is to split log.rs to smaller modules so it's easier to reason about.

OpenOptions has many dependencies. Move the most complicated one (repair) to
another standalone module.

Reviewed By: DurhamG

Differential Revision: D19431783

fbshipit-source-id: 28c3af7cd6e19588bfe9c395e2648244faf3179d
2020-01-17 03:57:58 -08:00
Jun Wu
f6d3fd3d39 indexedlog: move log::LogMetaData to a standalone module
Summary: The goal is to split log.rs to smaller modules so it's easier to reason about.

Reviewed By: DurhamG

Differential Revision: D19431785

fbshipit-source-id: b9a598b3018267ff6e43ef57dc807cb4e880467c
2020-01-17 03:57:57 -08:00
Xavier Deguillard
1e809fc681 revisionstore: allow building a {Content,Metadata}Store without local stores
Summary:
In order to avoid pathological linkrevfixup after pushing a change to a
pushrebase repo, we need to be able to fetch history data that is already
present locally from the server. Since the Rust stores will always check
whether the data is present locally before fetching it, we would not be
fetching anything, causing the pathological linkrevfixup to kick in.

By allowing the stores to be built without a local component, the prefetch
function will not find the local history, and will thus be able to fetch it
properly.

Reviewed By: DurhamG

Differential Revision: D19412619

fbshipit-source-id: 421c59c63634ead7f98e6ba89da0532067f7b412
2020-01-16 09:41:40 -08:00
Xavier Deguillard
d89eab8078 pyconfigparser: use String as arguments instead of PyBytes
Summary:
In the cpython bindings, the Rust String can take both PyBytes and PyUnicode
strings, which is perfect for Python3 compatibility as string literals are
PyBytes in Python2 and PyUnicode in Python3. The return values are kept as
Bytes for now as changing this is a much larger change in itself.

Other approaches tried:
 - Using PyUnicode as input/output: an extremely large codemod had to be done,
   with very little benefits
 - Using String as output: since we do have some configs that are unicode, on
   the Python side, the output might either be bytestrings or unicode, leading
   to weird bugs.

Reviewed By: DurhamG

Differential Revision: D18650466

fbshipit-source-id: aebdf30590dcae40b7df2787e5ece88e2ec9395c
2020-01-16 09:31:45 -08:00
Xavier Deguillard
c92c9d5a03 revisionstore: add some comment explaining the order of local/shared store
Summary:
Mercurial assumes some specific ordering, let's comment about this to make sure
we don't re-order and introduce subtle bugs.

Reviewed By: quark-zju

Differential Revision: D19394352

fbshipit-source-id: 0f9e02d2c6addf040311a54b8161b06bbeaa6be9
2020-01-15 15:55:11 -08:00
Xavier Deguillard
3030ad82b0 remotefilelog: fix remotefilelog.cachepath difference for Rust ContentStore
Summary:
The error message and the exception type are slightly different between the
Rust and Python ContentStore. For now, let's just fix this up manually.

Reviewed By: quark-zju

Differential Revision: D19394350

fbshipit-source-id: e432094a9dfcf605568a1890c0303b733e98d203
2020-01-15 15:55:10 -08:00
Genevieve Helsel
a7d3210d2e normalize hg status error between python and rust
Summary: The rust telemetry wrapper and the python code were not in sync in how they presented errors to the user on failure. This makes them identical in message and error code (however the "abort" from python is red in color).

Reviewed By: quark-zju

Differential Revision: D19313848

fbshipit-source-id: d5b95e8d5130f20fc85c6664933ee18702f7cc5e
2020-01-15 15:30:52 -08:00
Jun Wu
94c383327a dag: make NameDag support buffered changes
Summary:
Previously, NameDag only supports writing directly to disk. That is not easy
for hg to use, since hg knows commits before bookmark names during pull, while
the API requires both commits and bookmarks (to decide which commits belong to
the "master" group) at the same time.

The old API also does not match other storage layers like indexedlog and
metalog, where writes are buffered until explicitly flushed.

This diff adds the "write-in-memory" (named "add_heads"), and "flush" APIs to
make NameDag easier to use. Under the hood, it just copies the "added" portion
of the DAG, and re-use the old API ("build", renamed to "add_heads_and_flush")
to do the job.

Note: The buffered changes pattern cannot be used in IdDag, since Ids might
have to be re-assigned on flush. But NameDag does not expose the raw Ids in its
"normal" interface, so it's fine to reassign Ids on flush.

Reviewed By: markbt

Differential Revision: D19405474

fbshipit-source-id: 75e5e5815c78c3577a0138f48185f6c4b5a80891
2020-01-15 14:02:07 -08:00
Jun Wu
3a74b82a39 dag: rename NamedDag to NameDag
Summary:
This is more consistent with the name "IdDag", because both "Name" and "Id" are
noun.

Reviewed By: singhsrb

Differential Revision: D19405473

fbshipit-source-id: a3b7546dd0ddcae5d9ed562838bd6be10a47063c
2020-01-15 14:02:06 -08:00
Jun Wu
3c5766d59e dag: rename Dag to IdDag
Summary: This is more consistent with the name "NamedDag".

Reviewed By: singhsrb

Differential Revision: D19405472

fbshipit-source-id: f7023307acaf96bf77c9fa9704dcaf6fc59b56f2
2020-01-15 14:02:06 -08:00
Jun Wu
85a89c6f96 dag: add git commit graph as test fixture
Summary:
The git repo has a lot of merges. So it's interesting as a testing graph.
The file was generated by:

    git clone https://github.com/git/git --bare
    hg --config extensions.hggit= clone git.git git-hg
    hg --cwd git-hg debugbindag -r 'all()' -o git.bindag

Reviewed By: DurhamG

Differential Revision: D19411825

fbshipit-source-id: 0fec8a16786dc8e6790986bca5c62a76276aa942
2020-01-15 11:03:52 -08:00
Mark Thomas
56cd3eadb5 renderdag: connect vertical lines for non-merge commits
Summary:
This changes the pattern commonly seen in smartlog:

    ╷ o  e7f5f529  ...
    ╭─╯
    │ o  a1b2773d ...
    ╭─╯
    o  ecbb4eaa
    ╷

to:

    ╷ o  e7f5f529  ...
    ├─╯
    │ o  a1b2773da ...
    ├─╯
    o  ecbb4eaa
    ╷

The change only applies to commits with a single parent. Vertical lines from
merge commits stay disconnected intentionally. For example:

    │ │ o  I
    │ │ │
    │ o │      H
    ╭─┼─┬─┬─╮
    │ │ │ │ o  G
    │ │ │ │ │

This makes it more obvious that `H` has 5 parents while `I` only has 1 parent -
`I` does not "borrow" `H`'s parents.  This problem does not exist if `H` only
has 1 parent.

Reviewed By: quark-zju

Differential Revision: D19407687

fbshipit-source-id: 1046c8e2309f50e3f1620ed21f1b10573759a5f8
2020-01-15 06:57:51 -08:00
Stefan Filip
a2ca6bb7dd manifest-tree: move private test function to lib.rs
Summary:
Moving `store_element` and `get_hgid` from testutil.rs to lib.rs in order to
get rid of some buck build warnings. They are only used in lib.rs.

Reviewed By: markbt

Differential Revision: D19350343

fbshipit-source-id: 081783d753425daaae6fadc38589da5e4e8b745d
2020-01-14 11:49:53 -08:00
Stefan Filip
2c9fa6054d manifest: update make_tree to be public
Summary:
It makes sense for manifest-tree to expose a testutil version for building a
TreeManifest. No reason for the testutil function to be limited to the crate.
The only issue is that `make_tree` is to generic so we rename the function to
`make_tree_manifest`.

Reviewed By: markbt

Differential Revision: D19350351

fbshipit-source-id: 06fe04d99bf820d3e81417f5a5514aa4b0d282e6
2020-01-14 11:49:53 -08:00
Stefan Filip
887b776fe1 manifest: add testutil in core package
Summary:
Types that are defined by the manifest core crate should have test utilities
defined in the same crate.

This is motivated by various warning in the buck build.

Reviewed By: markbt

Differential Revision: D19350350

fbshipit-source-id: a6a7c3fb54b465aa09a28ff8b70b49a355b328fc
2020-01-14 11:49:52 -08:00
Jun Wu
684351a6a9 edenfs-client: fix warnings
Summary:
Fix warnings about unused imports, and fix Cargo.toml to use well-defined
versions.

Reviewed By: singhsrb

Differential Revision: D19386232

fbshipit-source-id: 4eb7478dfb4f34e8338d5090b3ba7225bfcb008f
2020-01-13 18:14:46 -08:00
Durham Goode
1c573e7569 workingcopy: add error handling and reporting to rust walker
Summary:
Previously, if the rust walker encountered any issues with individual
files it would stop and throw an exception or in some cases panic. To replace
the python walker we need to handle errors more gracefully, so let's add logic
for reporting which files had issues and what those issues were.

This also contains a minor one-line fix to prevent the walker from traversing
directories that contain a '.hg' directory.

Reviewed By: quark-zju

Differential Revision: D19376540

fbshipit-source-id: ede40f2b31aa1e856c564dd9956b9f593f21cf27
2020-01-13 15:26:30 -08:00
Jun Wu
3895af590a edenfs-client: use clidispatch IO and Error types
Summary:
This simplifies the error handling and makes it more compatible with things that
might capture the output, including the Python testing framework.

Reviewed By: DurhamG

Differential Revision: D19325642

fbshipit-source-id: 53de8b9a8294219e2b8e62831dce236841bd4cbb
2020-01-13 14:23:34 -08:00
Jun Wu
12eaf82291 hgcommands: add edenfs status fast path
Summary: Make `hg status` use the EdenFS Thrift path, similar to the telemetry wrapper.

Reviewed By: DurhamG

Differential Revision: D19325641

fbshipit-source-id: 14777a252d7cb433316511a2a1f1a6649e9cb020
2020-01-13 14:23:34 -08:00
Jun Wu
b9e7f6a995 cliparser: support shared flag definition
Summary:
Make it possible to define reusable flags like walkopts, templateopts that can
be shared among commands.

Reviewed By: DurhamG

Differential Revision: D19325640

fbshipit-source-id: 89a9ad0e59672b1f8cfa22daaacd3c75d8cff61a
2020-01-13 14:23:33 -08:00
Jun Wu
7e9aaf9da5 lib: add edenfs-client
Summary:
Add a crate aiming for talking to EdenFS via Thrift.

This is mostly ported from telemetry. There are lots of things that can be
cleaned up like moving the command portion to hgcommands, or using a better
abstracted rendering layer for outputting, etc. But for now, I just ported
the code as-is.

Reviewed By: DurhamG

Differential Revision: D19325639

fbshipit-source-id: ac70315ed166024857a90aab070e4d521f3e3b58
2020-01-13 14:23:33 -08:00
Jun Wu
b907d121db thrift-types: add a crate exposing EdenFS thrift interface
Summary:
The main motivation is to replace (super slow) Python Thrift serialization.
It also makes it possible to port the native "status" command from telemetry
wrapper to the main executable.

The crate is made to be built with either buck or cargo in fbcode environment:
- Cargo build uses `build.rs` to re-generate (checked-in) Thrift code.
- Buck build uses TARGETS to build Thrift code on the fly.

Reviewed By: DurhamG

Differential Revision: D19325638

fbshipit-source-id: 1d3287d8d6c674065500929827fa13ba9d448708
2020-01-13 14:23:32 -08:00
Mark Thomas
d06144e957 renderdag: test width correctly matches rendered width
Summary:
Enhance the tests to test whether the width of the graph matches the width that
was given by the `width` method before `next_row` was called.

The width we are interested in is the number of cells the string would occupy,
not the string length, so use the `unicode-width` crate to determine this.

Reviewed By: quark-zju

Differential Revision: D19282571

fbshipit-source-id: d9852c4c9e0f76c78db047f0da5dd34723a62a2a
2020-01-13 10:22:43 -08:00
Mark Thomas
aae60162eb renderdag: switch to . for ancestor lines
Summary: Switch from `:` to `.` for the ancestor lines.  I think they look better.

Reviewed By: quark-zju

Differential Revision: D19371721

fbshipit-source-id: e18f1a62e23620a82007e2c377607a0a61623830
2020-01-13 10:22:43 -08:00
Mark Thomas
f395961e33 renderdag: add additional glyph sets for box-drawing
Summary:
Box-drawing characters with curves aren't reliably renderable on Windows.  Add
a "square" glyph set that uses right-angle characters.  These characters are
available in cp437 and cp850, so should be available on most Windows systems.

For completeness, add a "dec" glyph set that uses DEC line drawing characters,
for use in old terminals like xterm.

Reviewed By: quark-zju

Differential Revision: D19371722

fbshipit-source-id: 35887243cceab66c702e2b5278b572f77946805f
2020-01-13 10:22:42 -08:00
Stefan Filip
98fc586755 types: add helper method for generating repo paths in testutil
Summary:
Using RepoPath::arbitrary() has a high chance or producing a lot of different
top level directories. The helper method added by this change,
`generate_repo_paths` simulates directories having various files and several
directories.

Reviewed By: quark-zju

Differential Revision: D19321366

fbshipit-source-id: 5c1aec78b6157f3cbea3d0673b29b3a676de88c0
2020-01-10 11:12:41 -08:00
Stefan Filip
403e4ae594 manifest-tree: add basic benchmark
Summary: Basic benchmark that covers insert, remove, finalize and iterations.

Reviewed By: quark-zju

Differential Revision: D19204893

fbshipit-source-id: 3aafbf6d94a558d2fd4ae59190f6046dc7c5885e
2020-01-10 11:12:41 -08:00
Stefan Filip
526b029faf manifest: move TestStore to the testutil module
Summary:
The TestStore can be leveraged by methods outside the manifest-tree crate.
Anything that wants to instantiate a manifest for test purposes could
leverage it.

Reviewed By: quark-zju

Differential Revision: D19204894

fbshipit-source-id: 4ac42d09855c70f829feefc6c71dcdbf7211cae3
2020-01-10 11:12:41 -08:00
Jun Wu
6249b1713d hgtime: fix timezone handling converting local date
Summary:
If the user specifies a timezone-less date, ex. `2000-7-1`, it's incorrect
to use the same offset of `Local::now()`, because of daylight savings.

Fix it by going through proper type conversion.

Reviewed By: DurhamG

Differential Revision: D19179800

fbshipit-source-id: 6fd433757e7f6e4baf4625944d4061548a436f11
2020-01-09 11:51:30 -08:00
Jun Wu
459ae48cef indexedlog: stabilize log::tests::test_repair_and_delete_content
Summary:
Previously, between 100 to 500 runs, the test is likely to fail:

  ---- log::tests::test_repair_and_delete_content stdout ----
  thread 'log::tests::test_repair_and_delete_content' panicked at 'assertion failed: `(left == right)`
    left: `"Verified first 141 entries, 999 of 1400012 bytes in log\nReset log size to 999\nIndex \"c\" is incompatible with (truncated) log\nRebuilt index \"c\""`,
   right: `"Rebuilt metadata\nVerified first 141 entries, 999 of 1400012 bytes in log\nReset log size to 999\nRebuilt index \"c\""`', src/log.rs:3414:9

The "Rebuilt metadata" part is missing.

This is because after D17742003, the end of `meta` is the `epoch`, which is a
random number. The test is rewriting the end of `meta` to corrupt it. That
has a chance to not corrupt the file. Change the test to corrupt the fixed
header of meta so it can always corrupt the file and therefore stabilize
the test.

Reviewed By: xavierd

Differential Revision: D19325087

fbshipit-source-id: 53d46c8a6cb9771582f8f37e27f185d303fde0ce
2020-01-09 08:48:47 -08:00
Jun Wu
064068169b dag: rename slice to VertexName
Summary:
Per discussion, we decided to use "VertexName" as the struct name for things
like commit hashes, or the string names in tests (or the Mozilla DAG in tests).
Therefore, introduce a dedicated VertexName type and repalce all callsites to
use it.

`bytes::Bytes` is used so copying the `VertexName` is somewhat considered cheap.

This adds some overhead copying slices (and `Bytes` has some overhead). It
regresses the "building segments" benchmark from 673ms to 773ms, which seems
okay given the cleaner interface.

Reviewed By: markbt

Differential Revision: D19154905

fbshipit-source-id: 4c6d4eca67c11c10ed5f21999ccdc3f1b01695e8
2020-01-08 21:35:28 -08:00
Jun Wu
232d5047cc dag: fix incorrect gca calculation
Summary: `gca(X) = gca(roots(X))`, not `gca(heads(X))`.

Reviewed By: sfilipco

Differential Revision: D19154907

fbshipit-source-id: 8d54992fbf5b03ce7ab2fbcc689c79351bf75947
2020-01-08 18:46:08 -08:00
Jun Wu
c6c4baa07e dag: redefine Id::{MIN, MAX} using Group
Summary:
Redefine Id::{MIN, MAX} using Group so Id::MAX.group() is a valid group.
Previously Id::MAX has an invalid group.

Reviewed By: sfilipco

Differential Revision: D19154908

fbshipit-source-id: 6108f02da786eb228a7d1b772ca5134c1ebb6f6b
2020-01-08 18:46:07 -08:00
Jun Wu
c4cd7df3b2 dag: extract logic calculating universal commits to a method
Summary:
Address review feedback on D18670160.

This also fixes the definition of "universal" a bit - the new code no longer
assumes there are only one head in the master group.

Reviewed By: sfilipco

Differential Revision: D19154906

fbshipit-source-id: 4fbf080354318300ce09dc7fba2dcc5f942b6308
2020-01-08 18:46:07 -08:00
Jun Wu
bcaff9062c dag: rebuild non-master id and segments if needed
Summary:
Add utilities to rebuild non-master ids and segments if necessary.

The `NamedDag` structure ensures indexes have 1:1 mapping.

Reviewed By: sfilipco

Differential Revision: D18838995

fbshipit-source-id: 4a48b183c182bd5e6336a2ca4973c36091fbbfd8
2020-01-08 18:46:07 -08:00
Jun Wu
28380e0272 dag: fix logic about avoiding unnecessary high level segments
Summary:
The check of "is this high level segment necessary or not" should be done
before dropping the last segment. This fixes an issue discovered in D18838992.

Reviewed By: sfilipco

Differential Revision: D19154904

fbshipit-source-id: fb4c83c39d66215bae168ad98e5cf78de91cc5a3
2020-01-08 18:46:06 -08:00
Jun Wu
d4eddd6b11 dag: use NamedDag in tests
Summary:
NamedDag is the high-level interface that is more interesting to test.

High-level segments are changed subtly because NamedDag syncs them to disk
first.

Reviewed By: sfilipco

Differential Revision: D18838992

fbshipit-source-id: c6a557e0a44a1d24320ea4a9e4283262f6f30f67
2020-01-08 18:46:06 -08:00