Commit Graph

708 Commits

Author SHA1 Message Date
Stefan Filip
abbb2a5f7a revisionstore: use testutil in ancestors
Summary: testutil everywhere

Reviewed By: quark-zju

Differential Revision: D14877546

fbshipit-source-id: cb1534cf925a633370dd60a3191855d14bbaa84e
2019-04-14 19:56:46 -07:00
Jun Wu
4ca0b2c778 indexedlog: move macros to a separate module
Summary:
The internal rustfmt linter suggests wrong autofixes for the `impl_offset!`
macro. That's noisy for every diff touching `index.rs`. Silence it by moving
macros to a separate file.

To be consistent, `define_error!` is also moved.

Differential Revision: D14885746

fbshipit-source-id: d1a518e631f80d6d7945f1ea3c2e4d18e1c799ca
2019-04-11 12:51:26 -07:00
Xavier Deguillard
5ddf39f788 remotefilelog: add an indexedlog contentstore
Summary:
While the Rust code can read/write content out of an indexedlog, the Python
code cannot. For now, all the writes will be done in Rust, and the Python code
will only be able to read from it.

Reviewed By: quark-zju

Differential Revision: D14894330

fbshipit-source-id: 5c1698d31412bc93e93dabb93be106a2ef17d184
2019-04-11 12:07:58 -07:00
Xavier Deguillard
10ae96292e asyncpacks: add indexedlogdatastore
Summary:
As the IndexedLogDataStore will be used in hg_memcache_client, it needs to be
used in async code, and thus needs an async wrapper.

Note: I should probably rename the crate to "asyncrevisionstore" :)

Reviewed By: quark-zju

Differential Revision: D14881362

fbshipit-source-id: 203ce50954d99899715b32f85e6118e757578ece
2019-04-11 12:07:58 -07:00
Jun Wu
59231d903d indexedlog: implement read-only fast paths for sync
Summary:
In case there is nothing to write, `Log` and `LogRotate` can take
a fast path that does not take directory locks.

Differential Revision: D14885450

fbshipit-source-id: 4d72d5a3e33b7371880ad31f8bc43ed31c03797f
2019-04-10 20:59:56 -07:00
Jun Wu
29fa45c973 indexedlog: rename some flush() to sync()
Summary:
The `flush()` function does two things: read and write. It's not just writing
data. Rename it to `sync` to clarify.

`Index::flush` is unchanged because although it might read new data, the new
data is not visible. Calling `Index::flush` without dirty changes does not
cause visible (queryable) changes to the index.

`FlushFilter` is unchanged because it is coupled with the write path. It is
not to filter reading.

The old name is kept temporarily until all pending code gets committed and
we can do a codemod.

Differential Revision: D14885451

fbshipit-source-id: 3aed3b741e5e8f09b611ddcc25930a6fdf71706c
2019-04-10 20:59:56 -07:00
Jun Wu
a4322ca904 logrotate: make writable_log private
Summary:
The `writable_log` API can be misused to "flush" a Log, bypassing the check
about whether it should be rotated or not.

The real need of `writable_log` is to get access to indexes on the "writable"
(or "latest") log. Therefore let's just expose that instead.

Practically, the only use case of querying the index on the "latest" log is to
make sure dependent content is written to the same Log. That also requires a
"flush_filter" to be provided. Therefore add an assertion about it.

Differential Revision: D14866022

fbshipit-source-id: f6c07a498597b6f0f07d7cc3130e9033ba8b9be4
2019-04-10 19:50:01 -07:00
Jun Wu
f364cd1420 logrotate: add flush_filter
Summary:
Introduce the "flush filter" that can replace content to be written.
This would be useful to make sure delta chains are self-contained.

For LogRotate, flush_filter is triggered not only when the log file
was modified, but also when rotation happens.

Differential Revision: D14866024

fbshipit-source-id: f417200d3ae573e9ac82985ad6afd082412b358d
2019-04-10 19:50:01 -07:00
Jun Wu
2e300dba51 indexedlog: add a flush_filter function to Log
Summary:
The flush filter allows mutating entries being flushed. It can be used to avoid
inserting duplicated data.
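The idea can be sketched with a toy in-memory log. Everything here (`ToyLog`, `FilterDecision`, `skip_duplicates`) is a hypothetical stand-in for illustration, not the indexedlog API, and the toy applies the filter on every flush, which may differ from when the real filter is invoked.

```rust
/// What a filter decides for one pending entry (hypothetical enum).
enum FilterDecision {
    Keep,
    Drop,
    Replace(Vec<u8>),
}

/// Toy log: `disk` holds flushed entries, `dirty` holds pending ones.
struct ToyLog {
    disk: Vec<Vec<u8>>,
    dirty: Vec<Vec<u8>>,
}

impl ToyLog {
    /// Move pending entries to "disk". The filter sees what is already on
    /// disk and may keep, drop, or replace each pending entry.
    fn flush(&mut self, filter: impl Fn(&[Vec<u8>], &[u8]) -> FilterDecision) {
        for entry in std::mem::take(&mut self.dirty) {
            let decision = filter(&self.disk, &entry);
            match decision {
                FilterDecision::Keep => self.disk.push(entry),
                FilterDecision::Drop => {}
                FilterDecision::Replace(new) => self.disk.push(new),
            }
        }
    }
}

/// Filter that drops entries whose content is already on disk.
fn skip_duplicates(disk: &[Vec<u8>], entry: &[u8]) -> FilterDecision {
    if disk.iter().any(|e| e.as_slice() == entry) {
        FilterDecision::Drop
    } else {
        FilterDecision::Keep
    }
}
```

Calling `log.flush(skip_duplicates)` on a log whose pending entries repeat something already flushed leaves only one copy on "disk".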

Differential Revision: D14866023

fbshipit-source-id: ecf6cf60a0a97cf8110ef9c957e7e3bbab5855fc
2019-04-10 19:50:00 -07:00
Jun Wu
0cd1d8ce9d indexedlog: error out if the primary log does not match metadata
Summary:
Previously the code allowed the "log" file to be longer than the metadata,
intended to allow advanced use cases that replace the "meta" file to
get a read-only view in the past.

That implies we trust the length of "log" file. But it's in theory easy to mess
up - when appending to the "log" file, the process might be killed.

Data integrity is first priority. Therefore let's just error out if the file
length does not match the metadata. To support read-only views in the past,
we can potentially use file names other than "meta" or support in-memory
metadata instead.
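A minimal sketch of the strict check described above, assuming the metadata records the expected length of the primary file. The function name and error text are hypothetical, not taken from indexedlog.

```rust
/// Compare the actual length of the primary "log" file with the length
/// recorded in the metadata, and refuse to continue on any mismatch
/// (longer or shorter).
fn check_primary_len(actual_len: u64, meta_len: u64) -> Result<(), String> {
    if actual_len == meta_len {
        Ok(())
    } else {
        Err(format!(
            "primary log is {} bytes but metadata expects {} bytes",
            actual_len, meta_len
        ))
    }
}
```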

Differential Revision: D14866025

fbshipit-source-id: bbf0061a6448375a2de06fbf31f2b9838c749be0
2019-04-10 19:50:00 -07:00
Xavier Deguillard
a38ac46869 revisionstore: add an indexedlog backed content store
Summary:
Packfiles are proving complex to make perform well in several situations.
For instance, repacks are required to keep common operations from spending most
of their time scanning and iterating over the filesystem. In fact, most of
the pain points with packfiles are caused by their immutability: once written,
they can no longer be updated.

IndexedLog on the other hand can be updated in place, and therefore does not
require repacks and thus does not exhibit some of the pathological behavior that
packfiles are showing.

As a first step, let's add a simple content store backed by indexedlog.

Reviewed By: quark-zju

Differential Revision: D14790070

fbshipit-source-id: 44f766db6a08169971f87a38246873c6e53c3233
2019-04-10 10:34:34 -07:00
Arun Kulshreshtha
de2605bf4d edenapi: make data and history batch sizes separately configurable
Summary: Make the batch size of data and history requests independently configurable, since data responses are typically much larger than history responses (since the former contains actual file data whereas the latter is only metadata).
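As a rough illustration, independent batch sizes could be applied like this. `BatchConfig`, its field names, and `batches` are assumptions made for this sketch and are not the edenapi types.

```rust
/// Hypothetical config holding one batch size per request kind.
struct BatchConfig {
    data_batch_size: usize,
    history_batch_size: usize,
}

/// Split `keys` into batches of at most `batch_size` items.
/// A size of 0 is treated as 1 so that `chunks` never panics.
fn batches<T>(keys: &[T], batch_size: usize) -> Vec<&[T]> {
    keys.chunks(batch_size.max(1)).collect()
}
```

With five keys, a `data_batch_size` of 2 gives three data requests while a `history_batch_size` of 4 gives two history requests.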

Differential Revision: D14859686

fbshipit-source-id: c87c31f3e6611a55ae712e7f0ed9bb392d31a579
2019-04-09 17:00:01 -07:00
Arun Kulshreshtha
1bb8d63b74 edenapi: split MultiDriver into its own module
Summary: Move the code for managing a curl::Multi session into its own submodule to avoid cluttering the main client file.

Reviewed By: quark-zju

Differential Revision: D14855344

fbshipit-source-id: 8c93959774c3bc03d2620012d1665228fbcb6681
2019-04-09 14:59:44 -07:00
Arun Kulshreshtha
9fdd71e4df edenapi: use curl multi interface
Summary: Use the curl multi interface to fetch multiple batches of files or history entries concurrently.

Differential Revision: D14718547

fbshipit-source-id: c5a740c7e9106b719e825540f8182be31a72bae7
2019-04-09 14:59:44 -07:00
Stefan Filip
a61980ab9b asyncpacks: update asyncmutablehistorypack to testutil
Summary: testutil everywhere

Differential Revision: D14716080

fbshipit-source-id: 197cce4f64443a7a065010dd9ff8da32f548d496
2019-04-08 16:21:08 -07:00
Stefan Filip
becc32004a asyncpacks: update asynchistorypacks.rs to testutil
Summary: testutil everywhere

Differential Revision: D14716079

fbshipit-source-id: c83388f5248bf6afd9c2b6af87dcd8f6b0b850e1
2019-04-08 16:21:08 -07:00
Stefan Filip
f958833800 asyncpacks: update asyncdatapack.rs to testutil
Summary: testutil everywhere

Differential Revision: D14713053

fbshipit-source-id: 26fcdea580dd45280bf2f1725dcdb6ab8948465f
2019-04-08 16:21:08 -07:00
Stefan Filip
d6ad49db5b manifest: migrate to types::testutil node
Summary: migration

Differential Revision: D14660306

fbshipit-source-id: 71df6814d93f8b9f814aedaa4ceb558a8b69cdf6
2019-04-08 16:21:08 -07:00
Stefan Filip
1e2816f7c6 types: add testutil module to help writing tests
Summary:
Building test objects can be tedious using various of our bottom bytes.
This diff addresses that issue by adding helper functions in a new module
in the types crate.

Handling this case could be improved in Rust.

Differential Revision: D14660307

fbshipit-source-id: a866c1f3ede60ba1b87eb17d35817b8a8d7674a4
2019-04-08 16:21:07 -07:00
Arun Kulshreshtha
6705e2d120 bindings: allow choice between edenapi backends
Summary: Allow users to configure which HTTP client backend to use for the Eden API via the `edenapi.backend` config option. Valid options are `curl` and `hyper`, with `curl` being the default.

Reviewed By: quark-zju

Differential Revision: D14657871

fbshipit-source-id: 7a9972d2380fbbd5ed62d1accae764dc03ca4c29
2019-04-05 17:34:14 -07:00
Arun Kulshreshtha
6c8c87dea1 edenapi: add curl-based client
Summary:
Add a new Eden API client based on libcurl (via the rust-curl crate). This should help us work around issues with Hyper.

This implementation is based on curl's "easy" API, and is intentionally naive. I intend to update it to use curl's "multi" API to send several concurrent HTTP requests per operation in a later diff.

Differential Revision: D14656756

fbshipit-source-id: 1f71074506844104f0f3237023b38317a7f41979
2019-04-05 17:34:14 -07:00
Jun Wu
12b98e1e96 indexedlog: use failure for error handling
Summary:
Failure makes it easier to chain errors, and backtraces. Use it.

There is probably still room for improvement, by chaining errors and avoiding
exposing low-level errors for APIs, and/or providing more context in error
messages. But it should be already much better than before.

Differential Revision: D14759305

fbshipit-source-id: b1d3a8ec959dde575f06533ea9e4cd0757057051
2019-04-05 12:17:28 -07:00
Jun Wu
dbfad715b8 logrotate: reduce max_log_count to u8 range
Summary:
Practically there are many issues with a large max_log_count:
- The directory scan would be slower.
- The index would be slower.

Let's reduce it to u8 range to address the issues. This also makes the
directory name short.

Differential Revision: D14717896

fbshipit-source-id: d39f008abe576991e14d444c37a049a6132df507
2019-04-03 22:16:33 -07:00
Jun Wu
cf82cb6340 indexedlog: replace atomicwrites with tempfile
Summary:
Some tests added by upcoming diffs were timing out while they don't seem that
expensive. I tracked it down to the use of `fsync` in atomicwrites.

In our case, we don't need `fsync`. `fsync` is useful for making sure the order
of file writes is desired even in case of system crash. For example, making
sure the "primary" log file is written before writing the "meta" file.

That's too expensive (esp. on filesystems like ext4) for our use case.
Indexedlog is designed to make sure data corruption can be detected, and there
can be a "reasonable" way to recover (ex. by deleting all indexes, scanning
through entries and re-inserting them in a new log), not to fight against OS
crashes.

`cargo bench` change on a btrfs filesystem:

Before:

  index flush                    42.570 ms
  log flush                       7.712 ms

After:

  index flush                    36.485 ms
  log flush                       1.609 ms

Differential Revision: D14759304

fbshipit-source-id: 66b95d10040cf1480367b767811dfabee5e27ffe
2019-04-03 22:16:33 -07:00
Jun Wu
7cb1663ae0 indexedlog: migrate to Rust 2018
Summary: Used `cargo fix --edition`. Removed some `mut`s according to rustc warnings.

Differential Revision: D14718308

fbshipit-source-id: 94e3c3f8e47143ede767fe883fdb5e9602b12854
2019-04-03 22:16:33 -07:00
Zeyi Fan
9277617d1e fix double free in cdatapack
Summary:
Mercurial recently started to generate empty pack files (`0x01`). This will cause this check to fail:

diffusion/FBS/browse/master/fbcode/scm/hg/lib/cdatapack/cdatapack.c;2c4197d003ed906dd8eaf70fbb04aa53440ce681$314-319

This will subsequently result in a double-free error between these two:

**In `error_cleanup`**

diffusion/FBS/browse/master/fbcode/scm/hg/lib/cdatapack/cdatapack.c;2c4197d003ed906dd8eaf70fbb04aa53440ce681$387-389

**In `close_datapack`**

diffusion/FBS/browse/master/fbcode/scm/hg/lib/cdatapack/cdatapack.c;2c4197d003ed906dd8eaf70fbb04aa53440ce681$401

This diff will fix this bug.

Differential Revision: D14759374

fbshipit-source-id: 06f192513a935740c2142b5a2baac87a28903496
2019-04-03 21:13:13 -07:00
Xavier Deguillard
e106c73ddf revisionstore: do not create an empty datapack/historypack
Summary:
Zeyi realized that empty packfiles were problematic for the cdatapack code.
While its code should be fixed, having empty packfiles lying around is
unnecessary anyway, so let's not write them.

Reviewed By: fanzeyi

Differential Revision: D14760942

fbshipit-source-id: a128eedaf79a6388a3c7142399715bb4eb96a2ae
2019-04-03 20:43:06 -07:00
Xavier Deguillard
39e66964f4 revisionstore: limit history repack memory usage
Summary:
While datapack repack is fairly inexpensive in memory and mostly limited to the
number of entries in its index, a historypack repack needs to keep both the
data and the index in memory. It appears that the overhead of doing so is a big
factor in repack taking a lot of memory as a resulting 100MB histpack would use
about 1.2GB of RAM. Extrapolating the numbers, a resulting 4GB histpack would
need 48GB, which is enough to put a devserver in a swapping state, and worse
for laptops. Limiting the historypack size to 400MB should cap the RAM usage to
a bit under 5GB.
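The extrapolation above is a straight linear scaling of the one observed data point (a 100MB histpack using about 1.2GB of RAM). A small sketch, using decimal units and a hypothetical function name:

```rust
/// Estimated RAM in MB for repacking into a histpack of `pack_mb` MB,
/// scaled linearly from the observation that 100MB needs about 1200MB.
fn estimated_ram_mb(pack_mb: u64) -> u64 {
    pack_mb * 1200 / 100
}
```

Under that assumption, `estimated_ram_mb(4000)` is 48000 (48GB) and `estimated_ram_mb(400)` is 4800 (a bit under 5GB).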

Reviewed By: kulshrax

Differential Revision: D14757839

fbshipit-source-id: b08bf01bddad01f1cae9cc67d4bd3d637c0bf0db
2019-04-03 16:56:09 -07:00
Arun Kulshreshtha
1f09251a85 edenapi: make hyper client fields private
Summary: Now that the Hyper client is contained in a single module, its fields do not need to be crate-public.

Reviewed By: sfilipco

Differential Revision: D14733274

fbshipit-source-id: aa5c2f4fd9fdf6e686da1fed6300e8cf8f7e5dbc
2019-04-03 14:14:27 -07:00
Arun Kulshreshtha
1472ff1efa edenapi: refactor Builder into Config
Summary:
Rather than having a `Builder` struct that knows how to build just one kind of Eden API client, let's have a common `Config` type that can be potentially passed to the constructor of several different client implementations.

This will allow the same config code to be re-used across different client types, as seen later in this stack.

Differential Revision: D14656757

fbshipit-source-id: 883ffd2dc0302ebe08960f079c113e2d0da2d2ca
2019-04-01 20:15:38 -07:00
Arun Kulshreshtha
48ab61d83d edenapi: move EdenApi implementation to client module
Summary: Move the implementation of the `EdenApi` trait for the current Hyper-backed Eden API client to the same module as the client, so that we can have several side-by-side implementations, each contained in their own respective module.

Differential Revision: D14656758

fbshipit-source-id: a0d5fa36ec346c40466df559ccc900b14a7c542f
2019-04-01 20:15:38 -07:00
Jun Wu
acc0aaea7d indexedlog: migrate from tempdir to tempfile
Summary: `tempdir` is deprecated. Use `tempfile` instead.

Differential Revision: D14690867

fbshipit-source-id: f5df77708078538a0832bd941726f280ed97355f
2019-04-01 17:16:18 -07:00
Jun Wu
1e59d25e17 indexedlog: add OpenOptions::index
Summary:
Make it a bit easier to define indexes.

Before:

    OpenOptions::new()
      .index_defs(vec![IndexDef::new("first-byte", |_| {
          vec![IndexOutput::Reference(0..1)]
      })])

After:

    OpenOptions::new()
      .index("first-byte", |_| vec![IndexOutput::Reference(0..1)])

Reviewed By: kulshrax

Differential Revision: D14690357

fbshipit-source-id: 6e80a91f4279f960d9f41369c228e79023b5164c
2019-04-01 17:16:17 -07:00
Jun Wu
88fb64a6ee indexedlog: use monospace font for links to code
Summary:
The Rust stdlib uses this pattern.  This is done by:

  sed -i 's/\[\([A-Z][a-zA-Z:]*\)\]/[`\1`]/g' *.rs

Unfortunately it seems only rustdoc nightly can linkify things correctly.

More context: https://github.com/rust-lang/rust/issues/43466

Reviewed By: kulshrax

Differential Revision: D14689887

fbshipit-source-id: ba2b5968bdaad06f39dc43962430906ee80692fd
2019-04-01 17:16:17 -07:00
Jun Wu
7c74b40bc1 logrotate: de-dup logic in OpenOptions
Summary:
rotate::OpenOptions is a superset of log::OpenOptions. Change the code to reuse
logic in log::OpenOptions as much as possible.

Reviewed By: kulshrax

Differential Revision: D14689888

fbshipit-source-id: a6958723c49f9d41b03100f01283a8c3fb37a1ab
2019-04-01 17:16:17 -07:00
Jun Wu
277d25b581 indexedlog: move checksum_type to OpenOptions
Summary:
The motivation of this is, LogRotate might copy dirty (non-flushed) entries
from one Log to another, and it cannot preserve the checksum type for those
entries. There are 2 solutions:

- Make `iter_dirty` return checksum type.
- Make checksum type known by Log directly.

The second choice provides a simpler public API. `append_advanced` can be
removed, then `iter_dirty` is still consistent with `iter`. Therefore this
change.

Differential Revision: D14688174

fbshipit-source-id: 09e07d64c886a5ce9bc48dce8e29d036af1c0381
2019-04-01 17:16:16 -07:00
Jun Wu
8fc9742997 indexedlog: make Log own OpenOptions
Summary: A later diff adds another field to OpenOptions that Log needs access to.

Differential Revision: D14688171

fbshipit-source-id: 33170a2b74639ba0fd8a9c86207d840fb6427580
2019-04-01 17:16:16 -07:00
Jun Wu
341b3dad6f logrotate: make flush delete old logs
Summary: This is the final piece to make space usage bounded.

Differential Revision: D14688179

fbshipit-source-id: a6e0058b9022789fcf036c4427d29eab19144b53
2019-04-01 17:16:16 -07:00
Jun Wu
b1b92b8def logrotate: make flush handle "latest" change
Summary:
If "latest" pointer has changed, we should write to the new "latest" Log,
instead of the stale one.

Differential Revision: D14688180

fbshipit-source-id: eab8df8ddb8f311e472361ecc2b1bc4155f2aba4
2019-04-01 17:16:15 -07:00
Jun Wu
c23508dcd9 indexedlog: add Log::iter_dirty
Summary:
This API iterates entries that are in-memory only. It is useful to extract
entries and store them elsewhere.

Differential Revision: D14688178

fbshipit-source-id: 6ace51d859ba6886aeb94689f6c45162b9c6958e
2019-04-01 17:16:15 -07:00
Jun Wu
f38bbfd92e logrotate: partially implement flush
Summary: Implement the basic flush logic. Missing bits are listed as TODO items.

Differential Revision: D14688177

fbshipit-source-id: 3613009ec2c216398af6eaff44487a20ceeb97ef
2019-04-01 17:16:15 -07:00
Jun Wu
cd1750f06b indexedlog: make Log::flush return the new file size
Summary:
The file size will be used to decide whether the Log needs "rotate" in upcoming
changes.

Reviewed By: kulshrax

Differential Revision: D14688169

fbshipit-source-id: b273abcc870b96650d2c76e6e742a3141ce48f13
2019-04-01 17:16:15 -07:00
Jun Wu
ec90e8db57 logrotate: implement append and lookup
Summary:
These methods just delegate to `Log` structures. Unfortunately, the key has to
be copied so it can be used by the iterator to query remaining logs.

Differential Revision: D14688172

fbshipit-source-id: fd581f7256031a0622ec0533c84daaab89f9bb82
2019-04-01 17:16:14 -07:00
Jun Wu
aecd9edae9 logrotate: implement open
Summary: Implement the open logic.

Reviewed By: kulshrax

Differential Revision: D14688170

fbshipit-source-id: df3d39040e2268b3eddb131b2ae1b1f76d3e4311
2019-04-01 17:16:14 -07:00
Jun Wu
f160f31cde logrotate: add a LogRotate structure
Summary:
Start implementing the "log rotate" idea by markbt. It is similar to
logrotate, with plain text log files replaced by indexedlog. This
implementation also avoids renaming, which can be troublesome on Windows,
by just increasing the number (ex. to rotate "1/", "2/", create "3/", and
delete "1/", without renaming "2/").
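The numbering scheme can be modeled in memory as follows. `RotationModel` is a hypothetical illustration that assumes `max_count` is at least 1; the real `LogRotate` operates on directories on disk.

```rust
/// In-memory model of the numbered directories ("1/", "2/", ...).
struct RotationModel {
    /// Ids of the directories that currently exist, oldest first.
    ids: Vec<u64>,
    /// Maximum number of directories to keep.
    max_count: usize,
}

impl RotationModel {
    /// Start with a single directory, "1/".
    fn new(max_count: usize) -> Self {
        RotationModel { ids: vec![1], max_count }
    }

    /// Rotate: create the next-numbered directory, then delete the oldest
    /// ones beyond `max_count`. Nothing is renamed.
    /// Returns the ids that were deleted.
    fn rotate(&mut self) -> Vec<u64> {
        let next = self.ids.last().map_or(1, |id| id + 1);
        self.ids.push(next);
        let excess = self.ids.len().saturating_sub(self.max_count);
        self.ids.drain(..excess).collect()
    }
}
```

With `max_count` set to 2 and directories 1 and 2 present, one more `rotate()` creates 3 and deletes 1, leaving 2 untouched.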

The main use case would be LRU key-value cache on disk.

Reviewed By: kulshrax

Differential Revision: D14688176

fbshipit-source-id: 3bf7917e06386ebf85d8d6deeea850c58f4875e8
2019-04-01 17:16:14 -07:00
Jun Wu
a7371c96d3 indexedlog: add create option to Log::OpenOptions
Summary:
One future need is to open a `Log` without creating it by default. The
newly added `create` option can be disabled to prevent that.

This also changes the code path so we no longer take a directory lock
unconditionally during `open`.

Differential Revision: D14688173

fbshipit-source-id: 88795d5637a1a5135d4014434b2cf828540c0333
2019-04-01 17:16:13 -07:00
Jun Wu
6555afa621 indexedlog: add Log::OpenOptions
Summary:
One of the upcoming changes is to add an option to avoid creating Log on demand
at open time. To avoid `open` being too complicated, add an `OpenOptions` struct.
This is consistent with `index` and `std::fs`.

Differential Revision: D14688175

fbshipit-source-id: bb7f1556a32f1f7b15c64a23c5aee7493dd40ce6
2019-04-01 17:16:13 -07:00
Stefan Filip
02851845a9 manifest: Fix skip_subtree on Leaf
Summary:
This diff fixes the behavior of `skip_subtree` when called on a Leaf. The bug is
that the path is not correctly handled in this case. The name of the file
continues to stay in the path resulting in incorrect path names for all
subsequent calls to `path()`.
The high-level perspective is that `skip_subtree` is a no-op in a Leaf node.
To fix this, clarify the behavior, and improve readability of the code, we add a new
state that handles popping elements from the path.

Durham noticed this bug when reviewing D14347655.

Reviewed By: quark-zju

Differential Revision: D14654557

fbshipit-source-id: 625278366e492a3048dddc44f9234a06d6928b7e
2019-04-01 11:51:16 -07:00
Jun Wu
64db96a4b7 indexedlog: make IndexDef clone-able
Summary:
It's hard to clone a `Fn`. But `fn` can be cloned. Change the API to use `fn`
instead.

Cloning `IndexDef` allows the same index definition to be used by multiple
Logs. It's used by upcoming diffs.
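A small sketch of why a plain `fn` pointer helps: `fn` pointers are `Copy`, so `Clone` can be derived, whereas a boxed `dyn Fn` field would not allow the derive. `IndexDefSketch` and `first_byte` are hypothetical names for illustration, not the real `IndexDef`.

```rust
use std::ops::Range;

/// Hypothetical stand-in for an index definition.
/// The `fn` pointer field is `Copy`, so deriving `Clone` just works.
#[derive(Clone)]
struct IndexDefSketch {
    name: &'static str,
    func: fn(&[u8]) -> Vec<Range<usize>>,
}

/// Index function: use the first byte of an entry as the key.
fn first_byte(_entry: &[u8]) -> Vec<Range<usize>> {
    vec![0..1]
}

/// Build one definition and clone it so that two logs can share it.
fn shared_defs() -> (IndexDefSketch, IndexDefSketch) {
    let def = IndexDefSketch { name: "first-byte", func: first_byte };
    (def.clone(), def)
}
```

The trade-off is that a `fn` pointer cannot capture state the way a closure can, so the index function has to be a free function or a non-capturing closure.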

Differential Revision: D14688181

fbshipit-source-id: 6fda03a5f744dc90ee5d7ad3f36c243602f33510
2019-03-30 08:59:13 -07:00
Jun Wu
69a6c18747 indexedlog: normalize benchmarks to use 204800 entries
Summary:
This makes it easier to compare benchmark results between abstractions.

A sample of the result is listed below. Comparing to radixbuf, which is highly
optimized and less flexible, indexedlog is about 10x slower on insertion, and
about 3x slower on lookup.

indexedlog:

  index insertion (owned key)    90.201 ms
  index insertion (referred key) 81.567 ms
  index flush                    50.285 ms
  index lookup (memory)          25.201 ms
  index lookup (disk, no verify) 31.325 ms
  index lookup (disk, verified)  46.893 ms

  log insertion                  18.421 ms
  log insertion (no checksum)    12.106 ms
  log insertion with index      110.143 ms
  log flush                       8.783 ms
  log iteration (memory)          6.444 ms
  log iteration (disk)            6.719 ms

radixbuf:

  index insertion                11.874 ms
  index lookup                    8.495 ms

Differential Revision: D14635330

fbshipit-source-id: 28b3f33b87f4e882cb3839c37a2a11b8ac80d3e9
2019-03-27 16:29:58 -07:00
Jun Wu
1568a30c9a indexedlog: add a benchmark inserting entries without checksum
Summary:
This is just a trivial test case showing the overhead of xxhash.

  log insertion                  18.359 ms
  log insertion (no checksum)     7.835 ms

Differential Revision: D14635329

fbshipit-source-id: adc2629c0c41aaab48d29d467849e4d96eb01c51
2019-03-27 16:29:58 -07:00
Jun Wu
08c42a9e06 radixbuf: fix cargo bench
Summary: Change the code to work with newer `rand` crate.

Reviewed By: kulshrax

Differential Revision: D14635328

fbshipit-source-id: 007f6749f2eab781a7dcf7d49b19aff1c81089b4
2019-03-27 16:29:58 -07:00
Arun Kulshreshtha
a57430330a types: fix typo in test name
Summary: Fix typo

Reviewed By: singhsrb

Differential Revision: D14647373

fbshipit-source-id: 4f0c0f4a2a411d50ca7c5414cd1a5a9995ee6690
2019-03-27 12:52:21 -07:00
Arun Kulshreshtha
7036097135 revisionstore: add convenience function to add many entries to a historypack
Summary: Add a function that adds all of the entries from a given iterator into a HistoryPack.

Differential Revision: D14647767

fbshipit-source-id: 29a71b37da86125e14135c40c279bfc8a454b568
2019-03-27 12:38:39 -07:00
Mark Thomas
6d84ff8825 mutation: create mutation store entries for local commits
Summary:
Computing mutation entries for all local commits is expensive when there are
lots of local draft commits.  Ideally we would have an indexed changelog that
would make these lookups fast, but until we have that, put entries in the
mutation store for these commits to take advantage of the fast lookup there.

Reviewed By: quark-zju

Differential Revision: D14566782

fbshipit-source-id: cc3a05715337a510a65d8ff436c59d16d0f0447e
2019-03-27 04:49:12 -07:00
Jun Wu
09b26ed273 indexedlog: resolve a warning
Summary: `std::fs` is only needed for Windows. Do not "use" it on *nix systems.

Reviewed By: sfilipco

Differential Revision: D14634779

fbshipit-source-id: 9fd9a29ae27e13f00b4adbc83a74bd92a1b1658c
2019-03-26 21:19:46 -07:00
Jun Wu
1348cf45f5 indexedlog: make fields in IndexDef private
Summary:
Change fields in IndexDef to private. Provide a public constructor method and
switch users to use that instead. This makes it possible to change the IndexDef
struct in the future (ex. having extra optional fields about whether the index
is backed by radix tree or something different).

Differential Revision: D14608955

fbshipit-source-id: 62a413268d97ba96b2c4efd2ce67cd4fa0ff4293
2019-03-26 21:19:46 -07:00
Zeyi Fan
cbda7a4748 Update to Rust 1.33.0
Summary: Update Rust toolchain to 1.33.0 with fixes to make our code compatible with 1.33.0.

Reviewed By: Imxset21, kulshrax

Differential Revision: D14608312

fbshipit-source-id: 2d9cf7d01692abaed32f9adffa0e5eb51cfacb4f
2019-03-26 15:43:17 -07:00
Mark Thomas
fdd103b31b configparser: avoid environment race in configparser tests
Summary:
The configparser tests `hg::tests::test_basic_hgplain` and
`hg::tests::test_hgplainexcept` set different values for `HGPLAIN` and
`HGPLAINEXCEPT`.  Since the tests run in parallel and use the same environment,
one of the tests may fail if they run at the same time.

For these tests, create a mutex for the environment and lock it for the
duration of the test, ensuring these tests do not interfere with each other.
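One way such a lock could look, as a sketch only: `ENV_LOCK`, `lock_env`, and `with_env_var` are hypothetical names, and the commit does not show the actual implementation.

```rust
use std::env;
use std::sync::{Mutex, MutexGuard};

/// Process-wide lock guarding environment variables in tests.
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Take the lock, recovering from poisoning left by a panicked test.
fn lock_env() -> MutexGuard<'static, ()> {
    ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner())
}

/// Set `name` to `value`, run `f`, then remove the variable again,
/// all while holding the lock.
fn with_env_var<T>(name: &str, value: &str, f: impl FnOnce() -> T) -> T {
    let _guard = lock_env();
    // `set_var` and `remove_var` are `unsafe` in the 2024 edition; older
    // editions only warn about the extra block. The lock serializes callers
    // that go through this helper.
    unsafe { env::set_var(name, value) };
    let result = f();
    unsafe { env::remove_var(name) };
    result
}
```

Tests that touch `HGPLAIN` or `HGPLAINEXCEPT` would then wrap their bodies in `with_env_var`, so only one of them mutates the environment at a time.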

Reviewed By: jsgf

Differential Revision: D14615394

fbshipit-source-id: 9f123668d93223655514db2ae34b05354a6b578c
2019-03-26 07:33:23 -07:00
Mark Thomas
ac64b2c858 mutationstore: use new location for ChaChaRng
Summary:
The ChaCha RNG has moved from `rand` to `rand_chacha`.  Use the new location to
prevent the deprecation warning.

Reviewed By: quark-zju

Differential Revision: D14596397

fbshipit-source-id: e082369d4cf2d4ab42eb37df83a2627b937dcf62
2019-03-25 14:07:58 -07:00
Mark Thomas
05fdc3c37f mutationstore: apply latest rustfmt
Reviewed By: quark-zju

Differential Revision: D14596405

fbshipit-source-id: 43cc962db9cbc06ea0bfcf4041cd191ef1f1bc2e
2019-03-25 14:07:57 -07:00
Jun Wu
b6631c103d indexedlog: fix tests on Windows
Summary:
Windows disallows rewriting or truncating mmaped files. Fix the tests by
either dropping the mmap, or skipping the test.

Reviewed By: sfilipco

Differential Revision: D14572119

fbshipit-source-id: dccafdc66db3830c2919232d899ba31365120066
2019-03-22 11:37:35 -07:00
Jun Wu
5e2df5b977 configparser: be more permissive about include paths
Summary:
Windows has 2 kinds of paths - the UNC path (starting with `\\?\`), and paths
most people use (ex. `C:\foo\bar`). The former is more powerful (reserved names
like `nul` can be used), and is the "canonicalized" form as seen by Rust stdlib.

The UNC paths are stricter, though. `/` is not treated as `\` automatically,
`.` and `..` are considered illegal. That is, trying to canonicalize a UNC path
with `..` or `.` will result in an error.

It's possible to get `.` or `..` into part of a UNC path by using the
`PathBuf::join` API provided by the Rust stdlib. That is, a legal UNC path stored
in `PathBuf` can become illegal by `join`ing a non-UNC path.

I'm not sure what's the most "clean" fix. Perhaps using two different types to
represent UNC path and non-UNC path in stdlib? But that's definitely not a trivial
(or even practical) change.

For now, just teach the config parser to "friendly try again" by stripping the
UNC prefix and re-canonicalize paths. So it can handle `.` or `..` used by
`%include`.
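Two pieces of this can be sketched with the standard library alone. `join` is purely lexical, so a `..` segment survives in the joined path, and a retry could start by stripping the `\\?\` prefix from the path string. `strip_unc_prefix` is a hypothetical helper, not the config parser's actual code, and it does not handle the `\\?\UNC\server\share` form.

```rust
use std::path::{Component, Path};

/// `Path::join` does not normalize: a `..` in `rel` stays in the result
/// as a `ParentDir` component.
fn join_keeps_parent_dir(base: &Path, rel: &str) -> bool {
    base.join(rel).components().any(|c| c == Component::ParentDir)
}

/// Hypothetical helper: drop a leading `\\?\` prefix from a path string,
/// if present, so the remainder can be canonicalized again.
fn strip_unc_prefix(path: &str) -> &str {
    path.strip_prefix(r"\\?\").unwrap_or(path)
}
```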

Reviewed By: sfilipco

Differential Revision: D14568119

fbshipit-source-id: 2a55faa945c8d03574fd56e82d946c9ef7f0138f
2019-03-22 11:37:35 -07:00
Jun Wu
8021c26449 indexedlog: avoid filesystem race on log creation
Summary:
The `load_or_create_meta` function is subject to filesystem races. Solve it by
always taking a lock.

This hurts performance a little bit. But `open()` should not be in a hot loop.
So it should probably be fine.

Reviewed By: sfilipco

Differential Revision: D14568122

fbshipit-source-id: d9b28555ab94252da4717de709b780b361e1dda7
2019-03-22 11:37:35 -07:00
Jun Wu
c74e894aa1 indexedlog: make directory locking work on Windows
Summary:
On Windows it's impossible to open(2) a directory. Therefore add a utility
function that creates a `lock` file automatically on Windows and opens that file
instead.

Reviewed By: sfilipco

Differential Revision: D14568117

fbshipit-source-id: bc7ae7046be654560c38fbd98ec4dd58c071b1dc
2019-03-22 11:37:35 -07:00
Jun Wu
9361d54b04 indexedlog: make sure meta file is created
Summary:
Previously, `load_or_create_meta` could return without actually creating the
meta file. That leads to problems when `load_or_create_meta` is called a
second time via `flush()`: it rewrites the primary file incorrectly. On Windows,
it will fail to rewrite the primary file.

Fix it by actually writing a meta file before returning.

Reviewed By: sfilipco

Differential Revision: D14568118

fbshipit-source-id: da3ad42bf48a923d732b1719839ca1953bd2b06c
2019-03-22 11:37:35 -07:00
Jun Wu
7e77bf81f0 indexedlog: rustfmt files
Summary: As the title.

Reviewed By: sfilipco

Differential Revision: D14568120

fbshipit-source-id: 7f2d8fb31e8f6c57976996e41f7a61503f5873a5
2019-03-22 11:37:35 -07:00
Arun Kulshreshtha
54e5f56277 types: add types for batched eden api requests
Summary: This diff adds serializable types representing batch requests for the Eden API. Just like the response types, these types must live in Mercurial's `types` crate so that they can be shared between the client and server.

Reviewed By: quark-zju

Differential Revision: D14573332

fbshipit-source-id: c31d718e6a97829ce1acfc25b8853dc3761323a7
2019-03-21 23:11:53 -07:00
Arun Kulshreshtha
4c1f11a751 types: fix mock node values
Summary: The mock values for the `Node` type are intended to have hashes that consist of a repeated digit (e.g., `1111111111111111111111111111111111111111`). However, since the bytes were specified using a single hex digit instead of two, the hashes were actually like `0101010101010101010101010101010101010101`. This diff fixes the values so they look as expected.
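The difference is easy to reproduce. In this sketch, `to_hex` and the two constants are illustrative names, not the ones in the `types` crate.

```rust
/// Render 20 bytes as a 40-character lowercase hex string.
fn to_hex(bytes: &[u8; 20]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// One hex digit per byte: each byte is 0x01 and prints as "01".
const SINGLE_DIGIT: [u8; 20] = [0x1; 20];

/// Two hex digits per byte: each byte is 0x11 and prints as "11".
const DOUBLE_DIGIT: [u8; 20] = [0x11; 20];
```

`to_hex(&SINGLE_DIGIT)` yields the `0101...` form described above, while `to_hex(&DOUBLE_DIGIT)` yields the intended `1111...` form.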

Reviewed By: quark-zju

Differential Revision: D14557546

fbshipit-source-id: 23651d70b9715d2fb77db162f689b87d9d43e5a2
2019-03-21 14:29:20 -07:00
Stefan Filip
e3b9873beb types: correctly import lazy_static in cargo
Summary: Fixes cargo test.

Reviewed By: quark-zju

Differential Revision: D14546435

fbshipit-source-id: daae0035871202fa3d221e11b0ea66199ded39d2
2019-03-20 19:56:14 -07:00
Stefan Filip
06704f2db2 radixbuf: add ignore marker to documentation blocks
Summary:
https://doc.rust-lang.org/rustdoc/documentation-tests.html#syntax-reference

Rustdoc will treat an indentation of 4 or more spaces as a code block and
attempt to run it as a documentation test.

Reviewed By: singhsrb

Differential Revision: D14543987

fbshipit-source-id: 92f78e9e052befba0bd3eea80ac171f651f2fced
2019-03-20 19:56:14 -07:00
Stefan Filip
9c90eb2e9c radixbuf: cargo fmt
Summary: Formatting

Reviewed By: singhsrb

Differential Revision: D14543986

fbshipit-source-id: 5aae3a6166c315872102ab90d87d46d782682bc8
2019-03-20 19:56:14 -07:00
Stefan Filip
5e147ca3b7 commitcloudsubscriber: updates tests.rs
Summary:
The main issue is that cargo test fails preventing adding sandcastle
configuration that would run these tests on CI.

Reviewed By: singhsrb

Differential Revision: D14543988

fbshipit-source-id: c299148cce01316fad872b9cf8e15dea6633da48
2019-03-20 19:56:14 -07:00
Stefan Filip
ca6052e70c revisionstore: update rand package
Summary: This fixes the build in test mode.

Differential Revision: D14533840

fbshipit-source-id: baa40261f17cdc8881d99a52a7f5cbd1ff66307a
2019-03-20 19:56:14 -07:00
Xavier Deguillard
bee64d1535 asyncpacks: make the Metadata mandatory when adding to an asyncdatapack
Summary:
Similarly to the previous change, let's make the asyncmutabledatapack force the
Metadata to be present.

Reviewed By: sfilipco

Differential Revision: D14443510

fbshipit-source-id: 26f851e8d38297dcc37410f0df6a69083531d516
2019-03-19 16:24:50 -07:00
Xavier Deguillard
1c1b1fadc7 revisionstore: make the Metadata mandatory when adding data to a datapack
Summary:
Now that mutablepacks can only create v1 packfiles, we can force the Metadata to
not be optional. The main reason for doing this is to avoid issues where LFS
data is stored without its corresponding LFS flag. This can cause issues down
the line as LFS data will be interpreted as is, instead of being interpreted as
a pointer to the LFS blob.
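
A minimal sketch of the idea, assuming a toy pack type: `ToyMutablePack`, the field layout of `Metadata`, and the value of `FLAG_LFS_POINTER` are all hypothetical and do not reflect the real revisionstore API. The point is only that a required argument cannot be forgotten by a caller.

```rust
/// Hypothetical flag bit marking the payload as an LFS pointer.
const FLAG_LFS_POINTER: u64 = 1;

/// Hypothetical metadata record attached to every stored blob.
#[derive(Clone, Debug, Default, PartialEq)]
struct Metadata {
    size: Option<u64>,
    flags: Option<u64>,
}

/// Toy in-memory pack, standing in for a mutable datapack.
#[derive(Default)]
struct ToyMutablePack {
    entries: Vec<(Vec<u8>, Metadata)>,
}

impl ToyMutablePack {
    /// `metadata` is a plain reference rather than an `Option`, so every
    /// caller has to decide which flags apply to the data being stored.
    fn add(&mut self, data: &[u8], metadata: &Metadata) {
        self.entries.push((data.to_vec(), metadata.clone()));
    }

    /// Readers can then tell a pointer apart from raw content.
    fn is_lfs_pointer(&self, index: usize) -> bool {
        self.entries
            .get(index)
            .and_then(|(_, meta)| meta.flags)
            .map_or(false, |flags| flags & FLAG_LFS_POINTER != 0)
    }
}
```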

Reviewed By: sfilipco

Differential Revision: D14443509

fbshipit-source-id: 9e7812017fc1356072278496406648f935024f92
2019-03-19 16:24:50 -07:00
Xavier Deguillard
10373e38e2 revisionstore: Force mutabledatapack to be created with v1
Summary:
The v0 format doesn't support flags, such as whether the data is actually an LFS pointer.
Let's simply forbid creating v0 packs.

Reviewed By: quark-zju

Differential Revision: D14443512

fbshipit-source-id: 6ffa2e8fda2b2baba0aae53e749bc9248594a134
2019-03-19 16:24:50 -07:00
Xavier Deguillard
a3cee67af5 revisionstore: ignore more errors in repack_packs
Summary:
These last 2 errors are still considered fatal, but shouldn't be and are most
likely transient. Failing to open a packfile that was successfully opened
before can for instance happen when the file is removed by another process, or
if it somehow becomes corrupted. Failing to remove the packfile should no
longer be an issue, but if it fails, we can also ignore it with the reasoning
that the next repack will take care of it.

Reviewed By: sfilipco

Differential Revision: D14441288

fbshipit-source-id: 6c2758c2a88fd5d2d83b55defe3d263ee9f974a1
2019-03-19 16:19:14 -07:00
Arun Kulshreshtha
e697e7d994 types: add types for batch responses
Summary: In order to send batch responses from the API server for data fetching operations, we need to define the types sent over the wire from within `/scm/hg/lib` so that we can deserialize them from within Mercurial. For ease of use, these types implement `IntoIterator` to allow easily iterating over the content (performing type conversions where needed).
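
As a rough illustration of the `IntoIterator` approach, here is a sketch with made-up types. `WireEntry`, `DataResponse`, and the tuple item type are hypothetical stand-ins and are not the types added by this commit.

```rust
/// Hypothetical wire-level entry as it might arrive from the server.
struct WireEntry {
    path: String,
    data: Vec<u8>,
}

/// Hypothetical batch response wrapping many entries.
struct DataResponse {
    entries: Vec<WireEntry>,
}

/// Conversion applied to each entry while iterating.
fn convert(entry: WireEntry) -> (String, Vec<u8>) {
    (entry.path, entry.data)
}

impl IntoIterator for DataResponse {
    type Item = (String, Vec<u8>);
    type IntoIter =
        std::iter::Map<std::vec::IntoIter<WireEntry>, fn(WireEntry) -> (String, Vec<u8>)>;

    /// Consumes the response and lazily converts each wire entry.
    fn into_iter(self) -> Self::IntoIter {
        self.entries
            .into_iter()
            .map(convert as fn(WireEntry) -> (String, Vec<u8>))
    }
}
```

With this in place a caller can write `for (path, data) in response { ... }` without knowing how the response is laid out on the wire.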

Reviewed By: quark-zju

Differential Revision: D14517259

fbshipit-source-id: 5ee867d8386e6b99cb5b4ed96338aeb7eb6a3e44
2019-03-19 14:28:48 -07:00
Arun Kulshreshtha
2418ee548c types: add mock values for Node and Key
Summary: When writing tests, it is often desirable to be able to quickly get a dummy value for a `Node` hash or `Key`. Trying to construct one on the spot can be overly verbose, so let's define some mock values that can be used by tests. This is similar to what Mononoke does (e.g., https://fburl.com/p9u55uye).

Reviewed By: quark-zju

Differential Revision: D14517258

fbshipit-source-id: e3d4cdd60010f44ca681d7a87e6124fe79f8a4c6
2019-03-19 14:28:48 -07:00
Arun Kulshreshtha
ef3f3dea44 types: rename LooseHistoryEntry and PackHistoryEntry
Summary: `LooseHistoryEntry` and `PackHistoryEntry` aren't the best names for these types, since the latter is what most users should use, whereas the former should typically only be used for data transmission. As such, we should rename these to clarify the intent.

Differential Revision: D14512749

fbshipit-source-id: 5293df89766825077b2ba07224297b958bf46002
2019-03-18 19:50:19 -07:00
Xavier Deguillard
41d275ad36 revisionstore: ignore transient errors during repack
Summary:
Corrupted packfiles, or their removal in the background, could cause repack to fail.
Let's simply ignore these transient errors and continue repacking.
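
The skip-and-continue behavior can be sketched as follows. The `repack` function and its `open` callback are hypothetical simplifications; the real repack code works on pack files rather than strings.

```rust
/// Merge the contents of every pack that can be opened.
///
/// `open` is a stand-in for opening a pack file. A failure is treated as
/// transient: the pack is skipped and left for a later repack.
fn repack<F>(paths: &[&str], mut open: F) -> Vec<String>
where
    F: FnMut(&str) -> Result<String, String>,
{
    let mut merged = Vec::new();
    for &path in paths {
        match open(path) {
            Ok(content) => merged.push(content),
            // Corrupted or concurrently removed pack: ignore and keep going.
            Err(_) => continue,
        }
    }
    merged
}
```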

Reviewed By: DurhamG

Differential Revision: D14373901

fbshipit-source-id: afe88e89a3bd0d010459975abecb2fef7f8dff6f
2019-03-11 18:15:45 -07:00
Stefan Filip
2eb3c24956 configparser: upgrade crate to rust edition 2018
Summary: As requested in D14380687.

Differential Revision: D14393014

fbshipit-source-id: 365c713b6f5a106cef0b945e63f224b7651d0e8f
2019-03-11 15:32:55 -07:00
Stefan Filip
3f33e9f3e9 configparser: fix XDG config loading
Summary:
The specs for both XDG and Mercurial say that when the XDG_CONFIG_HOME variable
is not set, we should default to $HOME/.config.

Windows and macOS also have a designated config folder outside of the home directory.
The `dirs` crate provides consistent access to this folder. I see no harm in looking at config
folders across all operating systems.
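
The fallback rule can be sketched with the standard library alone. The function name is hypothetical, and the environment values are passed in as arguments so the logic stays easy to test; this is not the configparser implementation, which uses the `dirs` crate.

```rust
use std::path::PathBuf;

/// Resolve the config directory from already-read environment values.
///
/// An unset or empty XDG_CONFIG_HOME falls back to `$HOME/.config`,
/// following the XDG Base Directory Specification.
fn xdg_config_dir(xdg_config_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    match xdg_config_home {
        Some(dir) if !dir.is_empty() => Some(PathBuf::from(dir)),
        _ => home.map(|h| PathBuf::from(h).join(".config")),
    }
}
```

A caller would typically feed it `std::env::var("XDG_CONFIG_HOME").ok().as_deref()` and the equivalent for `HOME`.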

Reviewed By: quark-zju

Differential Revision: D14380686

fbshipit-source-id: 5e5a9cd4694aaa49fbc526f4917dc4afdaeb9842
2019-03-11 15:32:55 -07:00
Stefan Filip
403e1c7ad2 configparser: rustfmt on hg.rs
Summary: Automatic formatting using rustfmt

Differential Revision: D14380687

fbshipit-source-id: 5f7832419b0941c00e2399c902454862580988a4
2019-03-11 15:32:55 -07:00
Stefan Filip
41e75fce3f manifest: fix infinite loop when cursor encounters error
Summary:
quark-zju noticed in code review that `Cursor` could get into an infinite loop when
its results were collected into a Vec<_>. That was the motivation I
needed to update `Cursor` to transition to `State::Done` when the cursor
encounters an error. Previously I felt that users of `Cursor` would only be
empowered by having the ability to retry the failure.
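
A toy version of the fix, assuming a much simpler cursor than the real one: `ToyCursor` and its two-state `State` enum are hypothetical. What it shows is that moving to `Done` on error lets `collect()` terminate.

```rust
/// Hypothetical two-state machine; the real `Cursor` has more states.
#[derive(Debug, PartialEq)]
enum State {
    Running,
    Done,
}

/// Toy cursor that replays a fixed list of step results.
struct ToyCursor {
    steps: Vec<Result<u32, String>>,
    pos: usize,
    state: State,
}

impl ToyCursor {
    fn new(steps: Vec<Result<u32, String>>) -> Self {
        ToyCursor { steps, pos: 0, state: State::Running }
    }
}

impl Iterator for ToyCursor {
    type Item = Result<u32, String>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.state == State::Done {
            return None;
        }
        match self.steps.get(self.pos).cloned() {
            Some(Ok(value)) => {
                self.pos += 1;
                Some(Ok(value))
            }
            Some(Err(error)) => {
                // Without this transition `pos` never advances, so the same
                // error would be yielded forever and `collect()` would hang.
                self.state = State::Done;
                Some(Err(error))
            }
            None => {
                self.state = State::Done;
                None
            }
        }
    }
}
```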

Reviewed By: quark-zju

Differential Revision: D14393590

fbshipit-source-id: b3e0974ac15d62f3f17790229121c0dec3a6149e
2019-03-11 15:27:52 -07:00
Jun Wu
6de9bec782 config: stop %include from scanning directories
Summary:
`listdir` makes it more expensive to detect config changes. We no longer need
it. Therefore drop the feature.

Reviewed By: markbt

Differential Revision: D13875655

fbshipit-source-id: 147adce45021c7b028aada5c40f498c2fd58c7f5
2019-03-08 16:57:06 -08:00
Stefan Filip
b7dee64bd2 manifest: fix tree entry serialization
Summary:
Follow up from D14178264.

Two changes:
 * tree manifest entries must end with a line feed
 * `t` is the byte that flags a directory
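
The two rules above can be sketched as a small serializer. The `name\0<hex node>` prefix is an assumption modeled on Mercurial's manifest text format, and `serialize_entry` is a hypothetical helper; only the trailing line feed and the `t` directory flag come from this commit.

```rust
/// Serialize one entry as `name\0<hex node>[t]\n`.
///
/// The name/NUL/hex layout is assumed for illustration.
fn serialize_entry(name: &str, hex_node: &str, is_directory: bool) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(name.as_bytes());
    out.push(b'\0');
    out.extend_from_slice(hex_node.as_bytes());
    if is_directory {
        // `t` is the byte that flags a directory.
        out.push(b't');
    }
    // Every entry must end with a line feed.
    out.push(b'\n');
    out
}
```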

Reviewed By: DurhamG

Differential Revision: D14368316

fbshipit-source-id: b0b46c876649b8f25bf0ecdb1266527dbeb33796
2019-03-07 17:51:39 -08:00
Stefan Filip
660992a50a manifest: add tree::diff(Tree, Tree)
Summary:
`manifest::tree::diff()` returns an iterator over the differences between two
tree manifests.

I chose a function that takes two parameters over a method on Tree because it
felt clearer to write `left` and `right`. Also, I am not sure how
iterators would be abstracted on a trait.
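
A sketch of the free-function shape, using flat maps in place of real tree manifests. `DiffEntry` and the `BTreeMap<String, u64>` representation are hypothetical, and unlike a real tree diff this version lists left-derived entries before right-only ones instead of walking both trees in step.

```rust
use std::collections::BTreeMap;

/// Hypothetical diff result; the real entry type carries more detail.
#[derive(Debug, PartialEq)]
enum DiffEntry {
    LeftOnly(String),
    RightOnly(String),
    Changed(String),
}

/// Lazily yield the differences between two path -> id maps.
fn diff<'a>(
    left: &'a BTreeMap<String, u64>,
    right: &'a BTreeMap<String, u64>,
) -> impl Iterator<Item = DiffEntry> + 'a {
    let from_left = left.iter().filter_map(move |(path, id)| match right.get(path) {
        None => Some(DiffEntry::LeftOnly(path.clone())),
        Some(other) if other != id => Some(DiffEntry::Changed(path.clone())),
        Some(_) => None,
    });
    let from_right = right
        .keys()
        .filter(move |path| !left.contains_key(*path))
        .map(|path| DiffEntry::RightOnly(path.clone()));
    from_left.chain(from_right)
}
```

Naming the parameters `left` and `right` keeps the call site symmetric, which is the readability argument made in the summary.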

Differential Revision: D14347656

fbshipit-source-id: 537574070cd18b08c77b3cd1cf4cff38d77fbf81
2019-03-07 17:46:44 -08:00
Stefan Filip
2deb0e6e42 manifest: add tree::Cursor and Tree::files()
Summary:
Cursor is a utility for iterating over a manifest tree. In this diff it is used
to implement Files. In the future it will be used to do a diff between two tree
manifests.

I am not sure how to describe an iterator return value in the Manifest trait so
I kept the function on the tree only for now. Looking forward to hearing your
suggestions.

Differential Revision: D14347655

fbshipit-source-id: ffd856443d8abe3ebd0557a096bf7a5ec46312d3
2019-03-07 17:46:44 -08:00
Xavier Deguillard
f868d77cd1 revisionstore: use remove_file from vfs.rs
Summary: The historypack wasn't using remove_file from vfs which was causing repack to fail.

Reviewed By: sfilipco

Differential Revision: D14373649

fbshipit-source-id: 2d87f24bda541bc011ed38533db1ac7bdddc81e3
2019-03-07 15:24:10 -08:00
Stefan Filip
c305e13566 manifest: mark FileMetadata as Copy
Summary:
`Node` is marked as `Copy`. `FileMetadata` is not much more than `Node` so it
seems pretty clear that it should be marked `Copy`.
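
For illustration, a sketch of what deriving `Copy` buys. The field layout of `FileMetadata` and the `FileType` enum are assumptions made for this example, not the crate's actual definitions.

```rust
/// Stand-in for the `Node` hash type, which is already `Copy`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Node([u8; 20]);

/// Hypothetical file kind stored next to the node.
#[derive(Clone, Copy, Debug, PartialEq)]
enum FileType {
    Regular,
    Executable,
    Symlink,
}

/// Small, plain-data struct: every field is `Copy`, so the struct can be too.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FileMetadata {
    node: Node,
    file_type: FileType,
}

/// Takes the metadata by value; callers keep their own copy.
fn is_executable(meta: FileMetadata) -> bool {
    meta.file_type == FileType::Executable
}
```

Because the type is `Copy`, a value can be passed to `is_executable` and still be used afterwards without an explicit `clone()`.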

Reviewed By: DurhamG

Differential Revision: D14347657

fbshipit-source-id: 939abf88087bc8c6f942047a08d6a4a0d61e053f
2019-03-07 11:20:07 -08:00
Stefan Filip
5b370ffb72 manifest: move tree link to a separate file
Summary:
Cleaning up the `mod.rs` file so that it provides more signal.
`Link` is an internal implementation detail that other internal components may depend on so it is a great candidate to be moved to a dedicated file.

Differential Revision: D14347654

fbshipit-source-id: e5b5a42faf1e9f9c4a0591e5bd94182391ed511f
2019-03-07 11:20:07 -08:00
Stefan Filip
6d9dc154ca manifest: add flush function to manifests
Summary:
Save, finalize, flush, they mean about the same thing.

The first thing to note is that this implementation is not complete because
the parents are not correctly passed into the hashing function.

The second thing is that store failures make the code a little more complex
than it would have been otherwise.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292713

fbshipit-source-id: 807d7a385a62cb5f4948f1781d3146eaa6502ca9
2019-03-05 16:12:48 -08:00
Stefan Filip
25edcc014b manifest: inline store_entry_to_links
Summary:
This function is a bit on its own with the removal of the pair conversion.
Since it is used in only one place, it makes sense to inline it.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292712

fbshipit-source-id: abbf1dc70d61c0ad039f5bc5ed5277d0770e3899
2019-03-05 16:12:48 -08:00
Stefan Filip
c5cc253234 manifest: refactor tests to use store::Entry::from_elements
Summary:
Working on the save mechanism I realized that links_to_store_entry is not that
useful because we can avoid the failure states where we would try to serialize
an ephemeral node. I am removing that function and converting the code that was
using that function to using the Entry constructor directly.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292714

fbshipit-source-id: 54ef46670319c27d90fc78511a1eb6abf47d3acf
2019-03-05 16:12:48 -08:00
Stefan Filip
43fb573c23 types: add explicit conversions from owned paths types to unsized ref
Summary:
There are scenarios where a &PathComponentBuf or a &RepoPathBuf will show up,
for example when using `get` on a HashMap. These are not the references that we
are looking for. We want &PathComponent and &RepoPath respectively. Adding
explicit conversions.
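
The same situation exists in the standard library with `PathBuf` and `Path`, which are used below as an analogy; the commit itself concerns `PathComponentBuf` and `RepoPathBuf`. The `longest_key` function is a hypothetical example of a place where automatic deref coercion does not apply.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Return the key with the longest textual form, as a borrowed `&Path`.
fn longest_key(map: &HashMap<PathBuf, u32>) -> Option<&Path> {
    map.keys()
        .max_by_key(|path| path.as_os_str().len())
        // `Option<&PathBuf>` does not coerce to `Option<&Path>`, so an
        // explicit owned-to-borrowed conversion is needed here.
        .map(PathBuf::as_path)
}
```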

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292711

fbshipit-source-id: 29f4de25c2ffebf7f009e4f2515e0ba8f0371ae0
2019-03-05 16:12:47 -08:00
Stefan Filip
765659d505 manifest: update tree manifest to take ownership of Store
Summary:
This is what Rust is telling us to do. The situation that triggers this update is
writing to the store. Particularly when the store is an in memory hashmap we
need to have a mutable borrow to the hashmap to insert into it. From a general
point of view this means that any sharing of the store between different
instances of a manifest will have to be handled by the struct that implements
the `Store` trait.
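
A sketch of the ownership arrangement described above. The `Store` trait shown here is a simplified, hypothetical version with string keys, and `MemStore`, `SharedStore`, and the `Tree` methods are invented for the example. It shows a tree owning its store, with sharing pushed into a store implementation.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Simplified store trait; inserting requires mutable access.
trait Store {
    fn insert(&mut self, key: String, value: Vec<u8>);
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

/// Plain in-memory store backed by a hash map.
#[derive(Default)]
struct MemStore {
    entries: HashMap<String, Vec<u8>>,
}

impl Store for MemStore {
    fn insert(&mut self, key: String, value: Vec<u8>) {
        self.entries.insert(key, value);
    }
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.entries.get(key).cloned()
    }
}

/// Store whose clones share one map; sharing is handled by the store itself.
#[derive(Clone, Default)]
struct SharedStore {
    inner: Arc<Mutex<HashMap<String, Vec<u8>>>>,
}

impl Store for SharedStore {
    fn insert(&mut self, key: String, value: Vec<u8>) {
        self.inner.lock().unwrap().insert(key, value);
    }
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.inner.lock().unwrap().get(key).cloned()
    }
}

/// The tree owns its store, so writing only needs `&mut self`.
struct Tree<S: Store> {
    store: S,
}

impl<S: Store> Tree<S> {
    fn new(store: S) -> Self {
        Tree { store }
    }
    fn write(&mut self, key: &str, value: &[u8]) {
        self.store.insert(key.to_string(), value.to_vec());
    }
    fn read(&self, key: &str) -> Option<Vec<u8>> {
        self.store.get(key)
    }
}
```

Two trees built from clones of one `SharedStore` see each other's writes, while a tree built from a `MemStore` keeps its data private.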

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292716

fbshipit-source-id: 6e789527dbdf3cd3ffe967f4900251bf31f7d6b2
2019-03-05 16:12:47 -08:00
Stefan Filip
fcc560357a types: add RepoPathBuf::pop()
Summary:
The practical aspect of this method comes when iterating over a tree and having
to maintain the current path. When going deep we will be pushing path
components and when coming back we will be popping path components.

I am not sure if it makes sense to return the path component or not. However I
believe that we should return some sort of error when RepoPath is empty.
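
The push-on-descent, pop-on-return pattern can be sketched with a toy tree. `ToyPathBuf` is a hypothetical stand-in for `RepoPathBuf`, and returning `Option<String>` from `pop` is just one possible answer to the open question above, not the signature chosen in this commit.

```rust
use std::collections::BTreeMap;

/// Toy tree node: a directory maps names to children, a file is a leaf.
enum Node {
    File,
    Dir(BTreeMap<String, Node>),
}

/// Stand-in for `RepoPathBuf`: a stack of path components.
#[derive(Default)]
struct ToyPathBuf {
    components: Vec<String>,
}

impl ToyPathBuf {
    fn push(&mut self, component: &str) {
        self.components.push(component.to_string());
    }

    /// Returns the removed component, or `None` if the path was empty.
    fn pop(&mut self) -> Option<String> {
        self.components.pop()
    }

    fn joined(&self) -> String {
        self.components.join("/")
    }
}

/// Depth-first walk that maintains the current path while iterating.
fn collect_files(node: &Node, path: &mut ToyPathBuf, out: &mut Vec<String>) {
    match node {
        Node::File => out.push(path.joined()),
        Node::Dir(children) => {
            for (name, child) in children {
                path.push(name); // going deeper
                collect_files(child, path, out);
                path.pop(); // coming back up
            }
        }
    }
}
```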

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14292715

fbshipit-source-id: 4ef1e10de7a60775340063b5baa317d3d626bc64
2019-03-05 16:12:47 -08:00
Stefan Filip
14f26aa355 manifest: add remove implementation for tree
Summary:
Removes a file from the manifest. Nothing special for it.

(Note: this ignores all push blocking failures!)

Reviewed By: quark-zju

Differential Revision: D14276645

fbshipit-source-id: 85e8ffd6cffee426c73eb627484dfa5a866a364b
2019-03-05 16:12:47 -08:00