Commit Graph

623 Commits

Author SHA1 Message Date
Xavier Deguillard
f170cceea2 revisionstore: Repackable::delete now takes the ownership of self.
Summary:
On some platforms, removing a file can fail if it's still mapped or opened. In
mercurial, this can happen during repack as the datapacks are removed while
still being mapped.
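
A minimal sketch of why taking `self` by value helps, using a hypothetical `PackFile` type rather than the actual revisionstore types: consuming `self` guarantees the handle is closed before the file is removed.

```rust
use std::fs::{self, File};
use std::io;
use std::path::PathBuf;

/// Hypothetical stand-in for a pack that keeps its backing file open.
struct PackFile {
    path: PathBuf,
    handle: File,
}

impl PackFile {
    fn open(path: PathBuf) -> io::Result<Self> {
        let handle = File::open(&path)?;
        Ok(PackFile { path, handle })
    }

    /// Taking `self` by value means no borrow of the pack can outlive this
    /// call, so the handle can be closed before the file is removed.
    fn delete(self) -> io::Result<()> {
        let PackFile { path, handle } = self;
        drop(handle); // close the file first
        fs::remove_file(&path)
    }
}
```

With a `&self` receiver, the open handle (or mapping) would still be alive while the file is being removed.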

Reviewed By: DurhamG

Differential Revision: D13615938

fbshipit-source-id: fdc1ff9370e2767e52ee1828552f4598105f784f
2019-01-14 21:14:13 -08:00
Xavier Deguillard
da3dd2319f revisionstore: remove repacked pack files
Summary:
After repacking the data/history packs, we need to clean up the
repacked files. This was an omission from D13363853.

Reviewed By: markbt

Differential Revision: D13577592

fbshipit-source-id: 36e7d5b8e86affe47cdd10d33a769969f02b8a62
2019-01-11 16:54:15 -08:00
Xavier Deguillard
ce16778656 remotefilelog: set proper file permissions on closed mutable packs.
Summary:
The python version of the mutable packs set the permission to read-only after
writing them, while the rust version keeps them writeable. Let's make the rust
one more consistent.
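
As a rough illustration (the helper name `make_readonly` is made up here, and this is not the actual remotefilelog code), marking a closed pack read-only with the standard library could look like:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Mark a finished pack file read-only, mirroring what the summary says
/// the Python implementation does after writing.
fn make_readonly(path: &Path) -> io::Result<()> {
    let mut perms = fs::metadata(path)?.permissions();
    perms.set_readonly(true);
    fs::set_permissions(path, perms)
}
```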

Reviewed By: markbt

Differential Revision: D13573572

fbshipit-source-id: 61256994562aa09058a88a7935c16dfd7ddf9d18
2019-01-11 16:54:15 -08:00
Mark Thomas
98417b1ffb configparser: fix warning about unused Result
Summary:
Use of `write!` requires checking for errors, however in this case, there is no
need to use `write!`, as we just want the error as a string.
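
A small sketch of the idea, with a hypothetical `ParseError` type that is not taken from configparser: `to_string()` goes through `Display` and yields a `String` directly, so there is no `Result` to check.

```rust
use std::fmt;

/// Hypothetical error type, used only for illustration.
#[derive(Debug)]
struct ParseError {
    line: usize,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Inside `fmt`, the `write!` result is returned, so it is used.
        write!(f, "parse error at line {}", self.line)
    }
}

/// Turn the error into a string without calling `write!` at the call site.
fn error_message(err: &ParseError) -> String {
    err.to_string()
}
```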

Reviewed By: ikostia

Differential Revision: D13596497

fbshipit-source-id: 5892025344936936188cf3a8ca227e71eff57d55
2019-01-08 06:19:55 -08:00
Jun Wu
f6158659f8 configparser: use hardcoded system config path on Windows
Summary:
When I was debugging an eden importer issue with Puneet, we saw errors caused
by important extensions (ex. remotefilelog, lz4revlog) not being loaded.  It
turned out that configparser was checking the "exe dir" to decide where to
load "system configs". For example, if we run:

  C:\open\fbsource\fbcode\scm\hg\build\pythonMSVC2015\python.exe eden_import_helper.py

The "exe dir" is "C:\open\fbsource\fbcode\scm\hg\build", and system config is
not there.

Instead of copying "mercurial.ini" to every possible "exe dir", this diff just
switches to a hard-coded system config path. It's now consistent with what we
do on POSIX systems.

The logic to copy "mercurial.ini" to "C:\open\fbsource\fbcode\scm\hg" or
"C:\tools\hg" becomes unnecessary and is removed.

Reviewed By: singhsrb

Differential Revision: D13542939

fbshipit-source-id: 5fb50d8e42d36ec6da28af29de89966628fe5549
2018-12-22 01:53:03 -08:00
Saurabh Singh
b193e23dd2 test-check-fix-code: unbreak test by fixing copyrights
Summary:
`test-check-fix-code.t` was failing due to copyright header missing
from certain files. This commit fixes the files by running

```
contrib/fix-code.py FILE
```

as suggested in the failure message.

Reviewed By: DurhamG

Differential Revision: D13538506

fbshipit-source-id: d8063c9a0e665377a9976abeccb68fbef6781950
2018-12-21 10:03:26 -08:00
Jun Wu
22e9000fc9 lz4-pyframe: add compresshc
Summary:
Unfortunately required symbols are not exposed by lz4-sys. So we just declare
them ourselves.

Make sure it compresses better:

  In [1]: c=open('/bin/bash').read();
  In [2]: from mercurial.rust import lz4
  In [3]: len(lz4.compress(c))
  Out[3]: 762906
  In [4]: len(lz4.compresshc(c))
  Out[4]: 626970

While it's much slower for larger data (and compresshc is slower than pylz4):

  Benchmarking (easy to compress data, 20MB)...
            pylz4.compress: 10328.03 MB/s
       rustlz4.compress_py:  9373.84 MB/s
          pylz4.compressHC:  1666.80 MB/s
     rustlz4.compresshc_py:  8298.57 MB/s
          pylz4.decompress:  3953.03 MB/s
     rustlz4.decompress_py:  3935.57 MB/s
  Benchmarking (hard to compress data, 0.2MB)...
            pylz4.compress:  4357.88 MB/s
       rustlz4.compress_py:  4193.34 MB/s
          pylz4.compressHC:  3740.40 MB/s
     rustlz4.compresshc_py:  2730.71 MB/s
          pylz4.decompress:  5600.94 MB/s
     rustlz4.decompress_py:  5362.96 MB/s
  Benchmarking (hard to compress data, 20MB)...
            pylz4.compress:  5156.72 MB/s
       rustlz4.compress_py:  5447.00 MB/s
          pylz4.compressHC:    33.70 MB/s
     rustlz4.compresshc_py:    22.25 MB/s
          pylz4.decompress:  2375.42 MB/s
     rustlz4.decompress_py:  5755.46 MB/s

Note python-lz4 was using an ancient version of lz4. So there could be differences.

Reviewed By: DurhamG

Differential Revision: D13528200

fbshipit-source-id: 6be1c1dd71f57d40dcffcc8d212d40a853583254
2018-12-20 17:54:22 -08:00
Jun Wu
4f24bffdde cpython-ext: move pybuf to cpython-ext
Summary:
The `pybuf` provides a way to read `bytes`, `bytearray`, and some `buffer` types
in a zero-copy way. The main benefit is using the same code to support different
input types. It's copied to a couple of places. Let's move it to `cpython-ext`.

Reviewed By: DurhamG

Differential Revision: D13516206

fbshipit-source-id: f58881c4bfe651a6fdb84cf317a74c3c8d7a4961
2018-12-20 17:54:22 -08:00
Jun Wu
f23c6bc7e3 cpython-ext: add a way to pre-allocate PyBytes
Summary: Make it possible to write content directly into a PyBytes buffer.

Reviewed By: DurhamG

Differential Revision: D13528202

fbshipit-source-id: 8c0a4ed030439a8dc40cdfbd72b1f6734a8b2036
2018-12-20 17:54:22 -08:00
Jun Wu
6e88ac4794 lz4-pyframe: provide decompress_into API
Summary:
This allows decompressing into a pre-allocated buffer. After some experiments,
it seems `bytearray` will just break too many things, ex:

- bytearray is not hashable
- bytearray[index] returns an int
- a = bytearray('x'); b = a; b += '3' # will mutate 'a'
- ''.join([bytearray('')]) will raise TypeError

Therefore we have to use zero-copy `bytes` instead, which is less elegant. But
this API change is a step forward.

Reviewed By: DurhamG

Differential Revision: D13528201

fbshipit-source-id: 1cfaf5d55efdc0d6c0df85df9960fe9682028b08
2018-12-20 17:54:22 -08:00
Jun Wu
7831e2a4ce cpython-ext: add ways to zero-copy Vec<u8> into a Python object
Summary:
I need to convert `Vec<u8>` to a Python object in a zero-copy way for rustlz4
performance.

Assuming Python and Rust use the same memory allocator, it's possible to transfer
the control of a malloc-ed pointer from Rust to Python. Use this to implement
zero-copy. PyByteArrayObject is chosen because its struct contains such a pointer.
PyBytes cannot be used as it embeds the bytes, without using a pointer.
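
The Rust half of such a hand-off can be sketched as follows. The function names are illustrative, and the Python side (writing the pointer into a `PyByteArrayObject`) is deliberately not shown; this only demonstrates giving up ownership of the buffer without copying.

```rust
use std::mem::ManuallyDrop;

/// Give up ownership of a `Vec<u8>`'s heap buffer without copying it.
/// The caller becomes responsible for eventually freeing the buffer.
fn into_raw_buffer(v: Vec<u8>) -> (*mut u8, usize, usize) {
    let mut v = ManuallyDrop::new(v); // suppress the Vec's destructor
    (v.as_mut_ptr(), v.len(), v.capacity())
}

/// Reclaim a buffer produced by `into_raw_buffer`.
///
/// Safety: the triple must come from `into_raw_buffer` and must be
/// reclaimed at most once.
unsafe fn from_raw_buffer(ptr: *mut u8, len: usize, cap: usize) -> Vec<u8> {
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}
```

Handing the pointer to another owner is only sound if that owner frees it with the same allocator, which is the assumption stated above.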

Sadly there are no CPython APIs to do this job. So we have to write to the raw
structures. That means the code will crash if python is replaced by
python-debug (due to Python object header change). However, that seems less of an
issue given the performance wins. If python-debug does become a problem, we can
try vendoring libpython directly.

I didn't implement a feature-rich `PyByteArray` Rust object. It's not easy to
do so outside the cpython crate. Most helper macros to declare types cannot be
reused, because they refer to `::python`, which is not available in the current
crate.

Reviewed By: DurhamG

Differential Revision: D13516209

fbshipit-source-id: 9aa089b309beb71d4d21f6c63fcb97dbc798b5f8
2018-12-20 17:54:22 -08:00
Jun Wu
35c85018cd lz4-pyframe: add a benchmark
Summary:
This gives some sense about how fast it is.

Background: I was trying to get rid of python-lz4, by exposing this to Python.
However, I noticed it's 10x slower than python-lz4. Therefore I added some
benchmark here to test if it's the wrapper or the Rust lz4 code.

It does not seem to be this crate:

```
  # Pure Rust
  compress (100M)                77.170 ms
  decompress (~100M)             67.043 ms

  # python-lz4
  In [1]: import lz4, os
  In [2]: b=os.urandom(100000000);
  In [3]: %timeit lz4.compress(b)
  10 loops, best of 3: 87.4 ms per loop
```

Reviewed By: DurhamG

Differential Revision: D13516205

fbshipit-source-id: f55f94bbecc3b49667ed12174f7000b1aa29e7c4
2018-12-20 17:54:21 -08:00
Jun Wu
b3893b3d3c indexedlog: add methods on Log to do prefix lookups
Summary:
This exposes the underlying lookup functions from `Index`.

Alternatively we can allow access to `Index` and provide an `iter_started_from`
method on `Log` which takes a raw offset. I have been trying to avoid exposing
raw offsets in public interfaces, as they would change after `flush()` and cause
problems.

Reviewed By: markbt

Differential Revision: D13498303

fbshipit-source-id: 8b00a2a36a9383e3edb6fd7495a005bc985fd461
2018-12-20 15:50:55 -08:00
Jun Wu
3237b77e4c indexedlog: add APIs to lookup by prefix
Summary:
This is the missing API before `indexedlog::Index` can fit in the
`changelog.partialmatch` case. It's actually more flexible as it can provide
some example commit hashes while the existing revlog.c or radixbuf
implementations just error out saying "ambiguous prefix".

It can also be "abused" for the semantics of sorted "sub-keys": by replacing
"key" with "key + subkey" when inserting into the index, looking up using "key"
would return a lazy result list (`PrefixIter`) sorted by "subkey". Note:
the radix tree is NOT efficient (both in time and space) when there are common
prefixes, so this use-case needs care.
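
To illustrate the prefix-lookup and "key + subkey" semantics only, here is a sketch that uses a sorted `BTreeMap` as a stand-in for the index. It is not the radix tree implementation, and `lookup_prefix` is a name made up for this example.

```rust
use std::collections::BTreeMap;

/// Return every (key, value) whose key starts with `prefix`, in key order.
/// A sorted map stands in for the index here; it is not a radix tree.
fn lookup_prefix<'a>(
    index: &'a BTreeMap<Vec<u8>, u64>,
    prefix: &'a [u8],
) -> impl Iterator<Item = (&'a Vec<u8>, &'a u64)> + 'a {
    index
        .range(prefix.to_vec()..)
        .take_while(move |(key, _)| key.starts_with(prefix))
}
```

Inserting `b"abc1"` and `b"abc2"` and then looking up `b"abc"` yields both entries lazily, ordered by the trailing "subkey" bytes. An ambiguous commit-hash prefix can be reported the same way, by listing the first few matches.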

Reviewed By: markbt

Differential Revision: D13498301

fbshipit-source-id: 637856ebd761734d68b20c15866424b1d4518ad6
2018-12-20 15:50:55 -08:00
Jun Wu
562b7a1704 indexedlog: add a function to convert base16 to base256
Summary: This will be used in prefix lookups.
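
A possible shape for such a conversion, as a sketch only; the real function's signature and its handling of odd-length input are not stated here, so both are assumptions.

```rust
/// Pack a sequence of base16 digits (each in 0..16) into bytes, two digits
/// per byte, high digit first. Returns `None` for an odd-length input or a
/// digit that is out of range. (How the real function treats odd-length
/// input is not stated in the summary.)
fn base16_to_base256(digits: &[u8]) -> Option<Vec<u8>> {
    if digits.len() % 2 != 0 || digits.iter().any(|&d| d > 15) {
        return None;
    }
    Some(digits.chunks(2).map(|pair| (pair[0] << 4) | pair[1]).collect())
}
```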

Reviewed By: markbt

Differential Revision: D13498300

fbshipit-source-id: 3db7a21d6f35a18699d9dc3a0eca71a5410e0e61
2018-12-20 15:50:55 -08:00
Jun Wu
443a8f33b3 indexedlog: move binary indexedlog_dump out
Summary:
It makes testing duplicated - now `cargo test` would try running tests on 2 entry points:
lib.rs and indexedlog_dump.rs.  Move it to a separate crate to solve the issue.

Reviewed By: markbt

Differential Revision: D13498266

fbshipit-source-id: 8abf07c1272dfa825ec7701fd8ea9e0d1310ec5f
2018-12-18 08:17:21 -08:00
Jun Wu
61b1a5f475 indexedlog: fix rustc warnings
Summary: `write!` result needs to be used.

Reviewed By: markbt

Differential Revision: D13471967

fbshipit-source-id: d48752bcac05dd33b112679d7faf990eb8ddd651
2018-12-17 12:10:52 -08:00
Xavier Deguillard
79164e920c revisionstore: replace rand::chacha with rand_chacha
Summary: The former is deprecated and thus compiling revisionstore shows many warnings.

Reviewed By: markbt

Differential Revision: D13379278

fbshipit-source-id: d4b4662a1ad00997de4c46274deaf22f48487328
2018-12-17 12:07:22 -08:00
Mark Thomas
ca135cd33f cpython-failure: Integrate cpython PyResult with the failure crate
Summary:
Adds a new crate `cpython-result`, which provides a `ResultExt` trait, which
extends the failure `Result` type to allow conversion to `PyResult` by
converting the error to an appropriate Python Exception.

Reviewed By: quark-zju

Differential Revision: D12980782

fbshipit-source-id: 44a63d31f9ecf2f77efa3b37c68f9a99eaf6d6fa
2018-12-14 06:43:40 -08:00
Mark Thomas
cf4b52c19c mutationstore: add mutationstore
Summary:
The mutationstore is a new store for recording records of commit mutations for
commits that are not in the local repository.

It uses an indexedlog to store the data.  Each mutation entry corresponds to
the information about the mutation that led to the creation of a particular
commit, which is recorded as the successor in the entry.

Entries can come from three possible places:

* `Commit` metadata for a commit not available locally
* `Obsmarkers` for repos that have been migrated from evolution tracking
* `Synthetic` for entries created synthetically, e.g. by a pullcreatemarkers
  implementation.

The other commits referred to in an entry must predate the successor commit.
For entries that originated from commits, this is ensured, as the successor
commit hash includes the other commit hashes.  For other entry types, it is
an error to refer to later commits, and any entry that causes a cycle will
be ignored.

Reviewed By: quark-zju

Differential Revision: D12980773

fbshipit-source-id: 040d3f7369a113e710ed8c9f61fabec6c5ec9258
2018-12-14 06:43:40 -08:00
Mark Thomas
1346ff92c4 types: implement Debug for Node
Summary:
The derived debug for Node prints out each byte as a decimal number.  Instead,
make the Debug output for nodes look like `Node("hexstring")`.
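
A sketch of such a hand-written `Debug` impl, assuming a 20-byte tuple struct as a stand-in for the real `Node` definition (which is not shown here):

```rust
use std::fmt;

/// Stand-in for the node type; a 20-byte hash is assumed.
struct Node([u8; 20]);

impl fmt::Debug for Node {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Render each byte as two lowercase hex digits.
        let hex: String = self.0.iter().map(|b| format!("{:02x}", b)).collect();
        // `{:?}` on a String adds the surrounding double quotes.
        write!(f, "Node({:?})", hex)
    }
}
```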

Reviewed By: DurhamG

Differential Revision: D12980775

fbshipit-source-id: 042cbf6eade8403759684969e1f69f7f4e335582
2018-12-14 06:43:40 -08:00
Mark Thomas
88ab626e9a types: Add Nodes::random_distinct to randomly generate sets of nodes
Summary:
Add a utility function for tests to generate a vector of random nodes.  This
will be used in future tests.

Reviewed By: DurhamG

Differential Revision: D12980784

fbshipit-source-id: 73fc8643503e11a46a845671df94c912a5e49d23
2018-12-14 06:43:40 -08:00
Mark Thomas
d0c03f6aaf types: Add WriteNodeExt and ReadNodeExt
Summary:
Add traits that extend `std::io::Read` and `std::io::Write` to implement new
`read_node` and `write_node` methods, allowing simple reading and writing of
binary nodes from and to streams.
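
The extension-trait pattern described here can be sketched as below. The 20-byte `Node` stand-in and the exact method signatures are assumptions; only the trait and method names come from the summary.

```rust
use std::io::{self, Read, Write};

/// Stand-in for the node type: a fixed 20-byte binary hash (assumed).
#[derive(Debug, PartialEq)]
struct Node([u8; 20]);

trait ReadNodeExt: Read {
    /// Read exactly one binary node from the stream.
    fn read_node(&mut self) -> io::Result<Node> {
        let mut buf = [0u8; 20];
        self.read_exact(&mut buf)?;
        Ok(Node(buf))
    }
}

trait WriteNodeExt: Write {
    /// Write one binary node to the stream.
    fn write_node(&mut self, node: &Node) -> io::Result<()> {
        self.write_all(&node.0)
    }
}

// Blanket impls make the methods available on every reader and writer.
impl<R: Read + ?Sized> ReadNodeExt for R {}
impl<W: Write + ?Sized> WriteNodeExt for W {}
```

With the traits in scope, a `Vec<u8>` gains `write_node` and a `std::io::Cursor` gains `read_node`.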

Reviewed By: DurhamG

Differential Revision: D12980778

fbshipit-source-id: fc6751cd43a1693a5a5a3ac93aea74aec5fda4fe
2018-12-14 06:43:40 -08:00
Xavier Deguillard
5307fd8867 revisionstore: implement basic repack in rust
Summary:
The future of mercurial is rust, and one of the missing pieces is repacking of data/history packs. For now, let's implement a very basic packing strategy that just pulls all the packs into one, with one small optimization that puts all the delta chains close together in the output file.

At first, it's expected that this code will be driven by the existing python code, but more and more will be done in rust as time goes on.

Reviewed By: DurhamG

Differential Revision: D13363853

fbshipit-source-id: ad1ac2039e1732f7141d99abf7f01804a9bde097
2018-12-12 12:44:03 -08:00
Jun Wu
421c7b3f45 indexedlog: add a tool to dump indexedlog content
Summary: The tool can dump indexedlog content. Useful for manually investigating issues.

Reviewed By: DurhamG

Differential Revision: D13051387

fbshipit-source-id: 8687a1aa9dfb54776e80f184208c49da2492c34d
2018-12-06 14:57:52 -08:00
Jun Wu
54dc931140 indexedlog: use inlined leaf entries to further reduce index size
Summary:
Add a new entry type - INLINE_LEAF, which embeds the EXT_KEY and LINK entries
to save space.

The index size for referred keys is significantly reduced with little overhead:

  index insertion (owned key)     3.732 ms
  index insertion (referred key)  3.604 ms
  index flush                    11.868 ms
  index lookup (memory)           1.159 ms
  index lookup (disk, no verify)  2.175 ms
  index lookup (disk, verified)   4.303 ms
  index size (5M owned keys)     216626039
  index size (5M referred keys)   96616431
    11.87s user 2.96s system 98% cpu 15.107 total

The breakdown of the "5M referred keys" size is:

  type          count     bytes
  radixes       1729472   33835772
  inline_leafs  5000000   62780651

There are no other kinds of entries stored.

Previously, the index size of referred keys was:

  index size (5M referred keys)  136245815 bytes

So it's 136MB -> 96MB, a 29% decrease.

Reviewed By: DurhamG

Differential Revision: D13036801

fbshipit-source-id: 27e68e4b6c332c1dc419abc6aba69271952e4b3d
2018-12-06 14:57:52 -08:00
Jun Wu
a4958163ee indexedlog: optimize size of radix entries (BC)
Summary:
Replace the 20-byte "jump table" with 3-byte "flag + bitmap". This saves space
for indexes less than 4GB. There are some reserved bits in the "flag" so if we
run into space issues when indexes are larger than 4GB, we can try adding
6-byte integer, or VLQ back without breaking backwards-compatibility.

It seems to hurt flush performance a bit, because we have to scan the child
array twice. However, lookup (the most important performance) does not change
much. And the index is more compact.
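
The exact on-disk layout is not spelled out here. As a sketch of the general bitmap technique under that caveat: a 16-bit bitmap records which of the 16 children exist, and a population count locates a child in a densely packed offset list. The function names and the "offset 0 means absent" convention are assumptions.

```rust
/// Build a 16-bit bitmap with bit `i` set when child `i` is present.
/// An offset of 0 is taken to mean "no child" in this sketch.
fn child_bitmap(children: &[u64; 16]) -> u16 {
    let mut bitmap = 0u16;
    for (i, &offset) in children.iter().enumerate() {
        if offset != 0 {
            bitmap |= 1 << i;
        }
    }
    bitmap
}

/// Position of child `nibble` within the densely packed offset list that
/// follows the bitmap, or `None` if that child is absent.
fn child_slot(bitmap: u16, nibble: u8) -> Option<usize> {
    debug_assert!(nibble < 16);
    let bit = 1u16 << nibble;
    if bitmap & bit == 0 {
        return None;
    }
    // Count the present children that sort before this one.
    Some((bitmap & (bit - 1)).count_ones() as usize)
}
```

Building the bitmap before writing the packed offsets is one way a serializer ends up scanning the child array twice.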

After:

  index flush                    19.644 ms
  index lookup (disk, no verify)  2.220 ms
  index lookup (disk, verified)   4.067 ms
  index size (5M owned keys)     216626039 bytes
  index size (5M referred keys)  136245815 bytes

Before:

  index flush                    16.764 ms
  index lookup (disk, no verify)  2.205 ms
  index lookup (disk, verified)   4.030 ms
  index size (5M owned keys)     240838647 bytes
  index size (5M referred keys)  160458423 bytes

For the "referred key" case, it's 160->136MB, a 15% decrease.

A detailed breakdown of the index components is:

After:

  type       count     bytes (using owned keys)
  radixes    1729472   33835772
  links      5000000   27886336
  leafs      5000000   44629384
  keys       5000000  110000000

  type       count     bytes (using referred keys)
  radixes    1729472   33835772
  links      5000000   27886336
  leafs      5000000   44629384
  ext_keys   5000000   29894315

Before:

  type       count     bytes (using owned keys)
  radixes    1729472   58048380
  links      5000000   27886336
  leafs      5000000   44903923
  keys       5000000  110000000

  type       count     bytes (using referred keys)
  radixes    1729472   58048380
  links      5000000   27886336
  leafs      5000000   44629384
  ext_keys   5000000   29894315

Leaf nodes are taking too much space. It seems the next big optimization might
be inlining ext_keys into leafs.

Reviewed By: DurhamG, markbt

Differential Revision: D13028196

fbshipit-source-id: 6043b16fd67a497eb52d20a17e153fcba5cb3e81
2018-12-06 14:57:52 -08:00
Jun Wu
d8117b3b04 indexedlog: increase key count for size test
Summary:
Since the size test only runs once, we can use a larger number of keys. This is
closer to some production use-cases.

`cargo bench size` shows:

  index size (5M owned keys)     240838647
  index size (5M referred keys)  160458423

It currently uses 32 bytes per key for 5M referred keys.

Reviewed By: markbt

Differential Revision: D13027880

fbshipit-source-id: 726f5fb2da056e77ab93d82fda9f1afa500d0a8d
2018-12-06 14:57:52 -08:00
Jun Wu
55b6331aa4 indexedlog: add more benchmarks
Summary:
Add benchmarks about index sizes, and a benchmark of insertion using key
references.

An example `cargo bench` result running on my devserver looks like:

  index insertion (owned key)     3.551 ms
  index insertion (referred key)  3.713 ms
  index flush                    20.648 ms
  index lookup (memory)           1.087 ms
  index lookup (disk, no verify)  2.041 ms
  index lookup (disk, verified)   4.347 ms
  index size (owned key)            886010
  index size (referred key)         534298

Reviewed By: markbt

Differential Revision: D13027879

fbshipit-source-id: 70644c504026ffee2122d857d5035f5b7eea4f42
2018-12-06 14:57:52 -08:00
Jun Wu
d7129256d4 indexedlog: switch checksum table to little endian (BC)
Summary:
For checksum values like xxhash, there is no benefit using big endian. Switch
to little endian so it's slightly faster on the major platforms we
care about.

This is a breaking change. However, the format is not used in production yet.
So there is no migration code.

Reviewed By: markbt

Differential Revision: D13015465

fbshipit-source-id: ca83d19b3328370d089b03a33e848e64b728ef2a
2018-12-06 14:57:52 -08:00
Jun Wu
75b4f92c44 indexedlog: support different checksum functions for Log entries (BC)
Summary:
Previously, the format of a Log entry was hard-coded - length, xxhash, and
content. The xxhash always takes 8 bytes.

For small (ex. 40-byte) entries, xxhash32 is actually faster and takes less
disk space.

Introduce the "entry flags" concept so we can store some metadata about what
checksum function to use. The concept could be potentially used to support
other new format changes at per entry level in the future.

As we're here, also support data without checksums. That can be useful for
content with its own checksum, like a blob store with its own SHA1 integrity
check.

Performance-wise, log insertion is slower (but the majority of the insertion
overhead would be on the index part), and iteration is a little bit faster,
perhaps because the log can use less data.

Before:

  log insertion                  15.874 ms
  log iteration (memory)          6.778 ms
  log iteration (disk)            6.830 ms

After:

  log insertion                  18.114 ms
  log iteration (memory)          6.403 ms
  log iteration (disk)            6.307 ms

Reviewed By: DurhamG, markbt

Differential Revision: D13051386

fbshipit-source-id: 629c251633ecf85058ee7c3ce7a9f576dfac7bdf
2018-12-06 14:57:52 -08:00
Jun Wu
049cd99f05 indexedlog: use non-VLQ encoding for xxhash (BC)
Summary:
Xxhash result won't usually have leading zeros. So VLQ encoding is not an
efficient choice. Use non-VLQ encoding instead.
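
To make the size argument concrete, here is a sketch of one common VLQ variant (7 payload bits per byte, continuation bit in the high bit) next to a fixed 8-byte encoding. The exact VLQ variant and byte order used by indexedlog are not stated here, so both are assumptions.

```rust
/// Encode `value` as a base-128 VLQ, least significant group first:
/// 7 payload bits per byte, high bit set on every byte except the last.
fn vlq_encode(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Fixed-width alternative: always 8 bytes for a 64-bit checksum.
/// (Byte order is an arbitrary choice in this sketch.)
fn fixed_encode(value: u64) -> [u8; 8] {
    value.to_le_bytes()
}
```

A small integer such as 1 takes a single VLQ byte, but a 64-bit value with its top bit set takes 10 VLQ bytes against 8 fixed bytes, and checksums usually look like the latter.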

Performance-wise, this is noticeably faster than before:

  log insertion                  14.161 ms
  log insertion with index      102.724 ms
  log flush                      11.336 ms
  log iteration (memory)          6.351 ms
  log iteration (disk)            7.922 ms
    10.18s user 3.66s system 97% cpu 14.218 total
  log insertion                  13.377 ms
  log insertion with index       97.422 ms
  log flush                      11.792 ms
  log iteration (memory)          6.890 ms
  log iteration (disk)            7.139 ms
    10.20s user 3.56s system 97% cpu 14.117 total
  log insertion                  14.573 ms
  log insertion with index       94.216 ms
  log flush                      18.993 ms
  log iteration (memory)          7.867 ms
  log iteration (disk)            7.567 ms
    9.85s user 3.73s system 96% cpu 14.073 total
  log insertion                  15.526 ms
  log insertion with index       98.868 ms
  log flush                      19.600 ms
  log iteration (memory)          7.533 ms
  log iteration (disk)            7.150 ms
    10.13s user 4.02s system 96% cpu 14.647 total
  log insertion                  14.629 ms
  log insertion with index      100.449 ms
  log flush                      20.997 ms
  log iteration (memory)          7.299 ms
  log iteration (disk)            7.518 ms
    10.14s user 3.65s system 96% cpu 14.274 total

This is a format-breaking change. Fortunately we haven't really used the old
format in production yet.

Reviewed By: DurhamG, markbt

Differential Revision: D13015463

fbshipit-source-id: 6e7e4f7a845ea8dbf0904b3902740b65cc7467d5
2018-12-06 14:57:52 -08:00
Jun Wu
42c3ef6eb6 indexedlog: add benchmark for "log"
Summary:
A simple benchmark for "log". The initial result running from my devserver
looks like:

  log insertion                  33.146 ms
  log insertion with index      106.449 ms
  log flush                       9.623 ms
  log iteration (memory)         10.644 ms
  log iteration (disk)           11.517 ms
    13.75s user 3.61s system 97% cpu 17.778 total
  log insertion                  27.906 ms
  log insertion with index      107.683 ms
  log flush                      19.204 ms
  log iteration (memory)         10.239 ms
  log iteration (disk)           11.118 ms
    12.89s user 3.55s system 97% cpu 16.924 total
  log insertion                  31.645 ms
  log insertion with index      109.403 ms
  log flush                       9.416 ms
  log iteration (memory)         10.226 ms
  log iteration (disk)           10.757 ms
    13.07s user 3.02s system 97% cpu 16.423 total
  log insertion                  31.848 ms
  log insertion with index      109.332 ms
  log flush                      18.345 ms
  log iteration (memory)         10.709 ms
  log iteration (disk)           11.346 ms
    13.12s user 3.70s system 97% cpu 17.276 total
  log insertion                  29.665 ms
  log insertion with index      106.041 ms
  log flush                      16.159 ms
  log iteration (memory)         10.367 ms
  log iteration (disk)           11.110 ms
    12.99s user 3.27s system 97% cpu 16.717 total

Reviewed By: markbt

Differential Revision: D13015464

fbshipit-source-id: 035fee6c8b6d0bea4cfe194eed3d58ba4b5ebcb8
2018-12-06 14:57:52 -08:00
Durham Goode
1a3a0bcd72 nodemap: add key iteration
Summary:
An upcoming diff will need the ability to iterate over all the keys in
the store. So let's expose that functionality.

Reviewed By: quark-zju

Differential Revision: D13062575

fbshipit-source-id: a173fcdbbf44e2d3f09f7229266cca6f3e67944b
2018-12-06 11:47:41 -08:00
Durham Goode
60b3bebaff nodemap: python bindings for rust nodemap
Summary: Simple python bindings for the new nodemap rust structure

Reviewed By: quark-zju

Differential Revision: D13062572

fbshipit-source-id: d60407b87bfc19b496de09273a9c8d6b59af0b8b
2018-12-06 11:47:41 -08:00
Durham Goode
e9b755198c nodemap: introduce rust bidirectional node map
Summary:
Introduces a nodemap structure that stores the mapping between two
nodes with bidirectional indexes.
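
The idea of a two-way mapping can be sketched with two in-memory hash maps. `BiNodeMap` and its method names are hypothetical, and the storage used by the real structure is not described in this summary.

```rust
use std::collections::HashMap;

/// A 20-byte node hash (assumed size).
type Node = [u8; 20];

/// Sketch of a two-way node mapping: one map per lookup direction.
#[derive(Default)]
struct BiNodeMap {
    forward: HashMap<Node, Node>,
    backward: HashMap<Node, Node>,
}

impl BiNodeMap {
    /// Record the pair in both directions.
    fn add(&mut self, first: Node, second: Node) {
        self.forward.insert(first, second);
        self.backward.insert(second, first);
    }

    fn lookup_by_first(&self, first: &Node) -> Option<&Node> {
        self.forward.get(first)
    }

    fn lookup_by_second(&self, second: &Node) -> Option<&Node> {
        self.backward.get(second)
    }
}
```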

Reviewed By: quark-zju

Differential Revision: D13047698

fbshipit-source-id: 967bf4b26a4b57e4fa2421a342edb21d3a5adbf6
2018-12-06 11:47:41 -08:00
Durham Goode
668ba5165c indexedlog: add an iterator function for iterating over keys
Summary:
You can currently iterate over indexedlog entries, but there's no way to
iterate over the keys without keeping a copy of the index function with you.
Let's add a key iterator function.

Reviewed By: quark-zju

Differential Revision: D13010744

fbshipit-source-id: 1fcaf959ae82417e5cbafae7c1927c3ae8f8e76a
2018-12-06 11:47:41 -08:00
Arun Kulshreshtha
c60a188e34 mononokeapi: support both http and https
Summary: Allow MononokeClient to support both HTTP and HTTPS. The protocol used is determined by the scheme of the server base URI passed in. For example, specifying `https://mononoke-api.internal.tfbnw.net` would use HTTPS, whereas specifying `http://localhost:12345` would use HTTP. This is useful for local testing.
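
Scheme-based selection can be sketched with plain string handling. The `Scheme` enum and `scheme_of` function are illustrative and are not the MononokeClient API.

```rust
#[derive(Debug, PartialEq)]
enum Scheme {
    Http,
    Https,
}

/// Pick the protocol from the scheme of a base URI string.
/// Returns `None` for any other scheme.
fn scheme_of(base_uri: &str) -> Option<Scheme> {
    let lower = base_uri.to_ascii_lowercase();
    if lower.starts_with("https://") {
        Some(Scheme::Https)
    } else if lower.starts_with("http://") {
        Some(Scheme::Http)
    } else {
        None
    }
}
```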

Reviewed By: DurhamG

Differential Revision: D13089197

fbshipit-source-id: 2da72ac98c60746200334e4bcc0e2568abe3073b
2018-12-03 17:46:51 -08:00
Arun Kulshreshtha
365352a0ba mononokeapi: client library for mononoke api server
Summary:
This diff adds a new `mononokeapi` crate, which is a Rust client library for the Mononoke API server. The crate is intended for use beyond Mercurial, and as such attempts to expose functionality in a reasonably generic way.

Right now, the only method supported by this crate is `/health_check`, which is the API server's health check endpoint that simply returns the string "I_AM_ALIVE" on success. Future diffs will expand this crate to include more of the API server's actual functionality. For now, this version serves as a proof of concept of how the crate will be structured.

The crate currently uses the `hyper` crate for its HTTP client, with `native-tls` for TLS support. Given that the client credentials required for mutual authentication with the Mononoke VIP are encoded in a format that `native-tls` does not understand, some credential format conversion via the `openssl` crate is necessary.

Reviewed By: DurhamG

Differential Revision: D13055687

fbshipit-source-id: cc944abd579ce49928776646c0dcce567f99c3b6
2018-12-03 17:46:51 -08:00
Haozhun Jin
461dabad96 bookmark: Turn BookmarkStore into indexed-log backed
Summary:
Turn BookmarkStore rust implementation into indexed-log backed.
Note that this no longer matches existing mercurial bookmark store
disk representation.

Reviewed By: DurhamG

Differential Revision: D13133605

fbshipit-source-id: 2e0a27738bcec607892b0edab6f759116929c8e1
2018-11-28 10:21:26 -08:00
Kostia Balytskyi
452aab74cd hgmain: use correct slashes in canonicalized paths
Summary:
Before I implement a proper fix [1], let's just use the correct slashes.

[1]
Correct fix is de-verbatimization of canonicalized paths.
So, if `a-symlink->b` and `"C:\a".canonicalize()` produces `\\?\C:\b`, then doing `.push("c/d")` produces `\\?\C:\b\c/d`, where `c/d` is a *single* path component, because the path starts with `\\?\`. If there is no such prefix, it's fine to push forward-slash-separated things into Windows paths.

Differential Revision: D13234288

fbshipit-source-id: 2ca0326bbd91ddc6ffd259153915037264292dc1
2018-11-28 09:20:02 -08:00
Kostia Balytskyi
fcae2817da hgpython: use MAIN_SEPARATOR instead of backslash
Summary: I only tested the original diff on Windows, apparently.
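
A small sketch of the portable approach, with a made-up helper name: build the path with `std::path::MAIN_SEPARATOR` rather than a hard-coded backslash.

```rust
use std::path::MAIN_SEPARATOR;

/// Rewrite a forward-slash relative path using the platform's main
/// separator (`\` on Windows, `/` elsewhere).
fn to_native_separators(relative: &str) -> String {
    relative.replace('/', &MAIN_SEPARATOR.to_string())
}
```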

Reviewed By: mitrandir77

Differential Revision: D13188952

fbshipit-source-id: 9dc33cb0eedb8d3c09cb7a734528f71afd7cbe8a
2018-11-26 02:28:05 -08:00
Kostia Balytskyi
60f81d3be2 hg.rust: only use backslashes in canonicalized paths on Win
Summary:
We need to canonicalize `current_exe` to resolve symlinks on OSX.

Unfortunately, there's no way to just resolve symlinks without
turning the path into a `\\?\` path on Windows, AFAIK.

Once the path is canonicalized on Windows, it starts with `\\?\` and
forward slashes are no longer recognized as valid separators.

Here's a demonstration:
```
> cat src\main.rs
use std::path::Path;
fn main() {
    let p = Path::new("\\\\?\\C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial/entrypoint.py");
    for comp in p.components() { println!("{:?}", comp); }
    println!("{:?} exists: {}", p, p.exists());
    let p = Path::new("\\\\?\\C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial\\entrypoint.py");
    for comp in p.components() { println!("{:?}", comp); }
    println!("{:?} exists: {}", p, p.exists());
    let p = Path::new("C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial/entrypoint.py");
    for comp in p.components() { println!("{:?}", comp); }
    println!("{:?} exists: {}", p, p.exists());

}

> cargo run
Prefix(PrefixComponent { raw: "\\\\?\\C:", parsed: VerbatimDisk(67) })
RootDir
Normal("Code")
Normal("fbsource")
Normal("fbcode")
Normal("scm")
Normal("hg")
Normal("mercurial/entrypoint.py")
"\\\\?\\C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial/entrypoint.py" exists: false
Prefix(PrefixComponent { raw: "\\\\?\\C:", parsed: VerbatimDisk(67) })
RootDir
Normal("Code")
Normal("fbsource")
Normal("fbcode")
Normal("scm")
Normal("hg")
Normal("mercurial")
Normal("entrypoint.py")
"\\\\?\\C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial\\entrypoint.py" exists: true
Prefix(PrefixComponent { raw: "C:", parsed: Disk(67) })
RootDir
Normal("Code")
Normal("fbsource")
Normal("fbcode")
Normal("scm")
Normal("hg")
Normal("mercurial")
Normal("entrypoint.py")
"C:\\Code\\fbsource\\fbcode\\scm\\hg\\mercurial/entrypoint.py" exists: true
```

Differential Revision: D13176266

fbshipit-source-id: 5f35a3263e058d179b237c80f28e4fdf44105576
2018-11-23 04:28:11 -08:00
Kostia Balytskyi
c646d8aa2a hg.rust: canonicalize the main binary address
Summary:
This is important on OSX where `current_exe` will return the symlink address if `hg.rust` is a symlink.
Therefore, if you create a symlink to the `hg.rust` in the repo (like tests do), repo Python code won't be picked up, and the system code will be.
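
In standard-library terms, the resolution step looks roughly like this (the function name is illustrative):

```rust
use std::env;
use std::io;
use std::path::PathBuf;

/// Resolve the running binary's real location, following symlinks.
fn real_exe_path() -> io::Result<PathBuf> {
    env::current_exe()?.canonicalize()
}
```

Paths derived from the returned location are then relative to the real binary rather than to the symlink.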

Reviewed By: mitrandir77

Differential Revision: D13138333

fbshipit-source-id: ffdf27329609d77bee4b8a2eecc47e02cb2dd5c8
2018-11-21 05:47:45 -08:00
Jun Wu
61f0a3da45 tests: add a test-check test that runs fix-code.py
Summary:
Add "--dry-run" for fix-code.py and use it in test-check.
This avoids license header and version = "*" issues.

Reviewed By: ikostia

Differential Revision: D10213070

fbshipit-source-id: 9fdd49ead3dfcecf292d5f42c028f20e5dde65d3
2018-11-15 18:54:06 -08:00
Jun Wu
616306543b codemod: use explicit versions in Cargo.toml
Summary:
This is done by running `fix-code.py`. Note that those strings are
semvers so they do not pin down the exact version. An API-compatible upgrade
is still possible.

Reviewed By: ikostia

Differential Revision: D10213073

fbshipit-source-id: 82f90766fb7e02cdeb6615ae3cb7212d928ed48d
2018-11-15 18:54:06 -08:00
Jun Wu
647f7dfb8e indexedlog: fix misc benchmark
Summary:
The "misc" benchmark requires the base16 module to be public. It was made
private in a previous change. Let's make it public again so the benchmark can
run.

Reviewed By: singhsrb

Differential Revision: D13015031

fbshipit-source-id: 0dc1542803aae290de26651e367898eebfc95e83
2018-11-09 20:49:56 -08:00
Liubov Dmitrieva
372c5594b4 Scm Daemon: support cat tokens via passing token type to the URL
Summary: This is the final step to make CAT authentication work

Reviewed By: markbt

Differential Revision: D12975214

fbshipit-source-id: e445ca502f8abaac914140f3f30476d50b3c2fbc
2018-11-09 11:48:50 -08:00
Liubov Dmitrieva
299ecd354a Scm Daemon: support CAT tokens
Summary:
To solve friction with OAuth tokens we will support CAT tokens as well in Scm Daemon.

Icebreaker support has been done in D12942971

CAT tokens can be generated on dev servers without a user (via the tool based on TLS certs).

So we are going to use them in the next diff.

This will allow us to enable token-less cloud sync for everyone, scm daemon will use CATs.

Reviewed By: markbt

Differential Revision: D12962342

fbshipit-source-id: 173301387ee446622bf77b2d6bed6934b5ced2c3
2018-11-09 09:51:26 -08:00
Liubov Dmitrieva
8a024ed563 Scm Daemon: make scm daemon more robust with expiration of tokens
Summary:
Basically if Unauthorized it will try to access the token again and restart all the subscriptions
rather than trying to reconnect with the same token in infinite loop.

We know OAuth tokens have potential to be invalidated.

CAT tokens (which we are going to support as well) will always be valid for some time, like 1 day, so we need a smooth way to recover from Unauthorized and issue a fresh token.

Reviewed By: markbt

Differential Revision: D12960843

fbshipit-source-id: 630c446c490b0724df38c61507ee555dc7ed7241
2018-11-09 09:51:26 -08:00
Jun Wu
6a01a5de06 linelog: update README
Summary: This is a backport of my upstream patch https://phab.mercurial-scm.org/D4147

Differential Revision: D12970974

fbshipit-source-id: ed9d8db2e32818e6e5ab3f23f5a0097bfa2cc14e
2018-11-08 12:34:36 -08:00
Jun Wu
f37a5d8df7 rust: upgrade rust to 1.30.0 and bump zstd-sys version
Summary:
The vendored crates were changed by D12811597. Bump `zstd-sys` in `Cargo.toml` to be compatible.
As we're here, also bump rust compiler to 1.30.0 so it's consistent with buck build.

Reviewed By: kulshrax

Differential Revision: D12952552

fbshipit-source-id: 6274bf829b98b16aeb6795209d12aba8b475b46d
2018-11-06 18:13:20 -08:00
Wez Furlong
caad413499 load blobs using hg's rust config and datapack code
Summary:
This diff implements getBlob on top of the mercurial rust
datapack code.  It adds a C++ binding on top of the rust code to
make it easier to use and hooks it up in the hg backing store.

Need to figure this out for our opensource and windows builds:

* Need to teach them how to build and link the rust code
* need to add a windows version of the methods that accept paths;
  this is just a matter of adding a WCHAR version of the functions.

Reviewed By: strager

Differential Revision: D10433450

fbshipit-source-id: 45ce34fb9c383ea6018a0ca858581e0fe11ef3b5
2018-10-31 17:58:17 -07:00
Mark Thomas
2deae85b42 encoding: use Cow for returned types that may be references
Summary:
I broke the Windows build because the return type of `path_to_local_bytes` is
different on Windows and Unix, and so must be dealt with differently.  They are
different because on Windows we often need to make a copy, whereas on Unix we
can just use references to the byte data.  Cows to the rescue: unify them
behind a Cow type.
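
A rough sketch of how a `Cow` return type can unify the two platforms. The function body is an assumption for illustration: in particular, the non-Unix branch uses UTF-8 as a stand-in for the real local (code page) encoding, and the error handling is invented here:

```rust
use std::borrow::Cow;
use std::io;
use std::path::Path;

/// Illustrative stand-in for `path_to_local_bytes`: the return type is the
/// same on every platform, so callers need no platform-specific handling.
#[cfg(unix)]
fn path_to_local_bytes(path: &Path) -> io::Result<Cow<'_, [u8]>> {
    use std::os::unix::ffi::OsStrExt;
    // Unix paths are already bytes, so borrowing is enough.
    Ok(Cow::Borrowed(path.as_os_str().as_bytes()))
}

#[cfg(not(unix))]
fn path_to_local_bytes(path: &Path) -> io::Result<Cow<'_, [u8]>> {
    // Assumption: UTF-8 stands in for the real local encoding here.
    let s = path.to_str().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "path is not valid Unicode")
    })?;
    // A conversion produces new data, so this branch must return an owned copy.
    Ok(Cow::Owned(s.as_bytes().to_vec()))
}
```

Callers can treat the result as `&[u8]` through deref, whichever variant they get.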

While we're here, tidy up and unify the docs.

Reviewed By: quark-zju, ikostia

Differential Revision: D12833091

fbshipit-source-id: e02e308e6f81dd3d8ddf33e76c3073f51d3eccc1
2018-10-30 04:07:02 -07:00
Jun Wu
61790b12a9 indexedlog: make it Send
Summary: It needs to be Send to be used in cpython.

Reviewed By: ikostia

Differential Revision: D10250289

fbshipit-source-id: ea57e356a0752764e50db9b6872b5cc4a456303f
2018-10-29 21:02:41 -07:00
Jun Wu
840d242822 indexedlog: revise docs for the index module
Summary:
Make it more detailed for public APIs. Hide too detailed information (file
format).

Reviewed By: DurhamG

Differential Revision: D10250140

fbshipit-source-id: d9d9af9d67984b80f07db13e69bbffdf77e6a30e
2018-10-29 21:02:41 -07:00
Jun Wu
23e41f98a4 indexedlog: revise checksum_table documentation
Summary: Revise ChecksumTable documentation so it's more detailed and accurate.

Reviewed By: DurhamG

Differential Revision: D10250142

fbshipit-source-id: bff89877fb9a65a305e8d8636a200d50c7e2d548
2018-10-29 21:02:41 -07:00
Jun Wu
ecc14e0860 indexedlog: update public documentation for the log module
Summary:
The log module is the "entry point" of other features. Update it so things are
more detailed. I tried to make it more friendly for people without knowledge
about the implementation details.

This could probably be further improved by adding some examples. For now, I'm
focusing on the plain English parts.

To reviewers: Let me know how you feel reading it assuming no prior knowledge
with the implementation. Ways to make sentences shorter, natural to native
speakers without losing important information are also very welcome.

Reviewed By: DurhamG

Differential Revision: D10250141

fbshipit-source-id: 35258c7197c1ce0a1d3d0554fab2f2d2866e123c
2018-10-29 21:02:41 -07:00
Jun Wu
67ff256aa2 indexedlog: revise crate-level document and visibility of modules
Summary:
Make important modules public. Make internal utility (base16) private.  Add
some text to the crate-level document. It just refers to important structures.
Will revise document of those structures.

Reviewed By: DurhamG, kulshrax

Differential Revision: D10250143

fbshipit-source-id: c79859ee7d3d9cc4ee9a093ef5d12ec6599f2a42
2018-10-29 21:02:41 -07:00
Mark Thomas
93a98afbe4 vlqencoding: don't require Sized for Read or Write traits
Summary:
The `VLQEncode` and `VLQDecode` traits erroneously expected the (automatic)
`Sized` marker trait for `Read` and `Write`.  This meant they couldn't be used
for trait object `Read`s or `Write`s without jumping through hoops or extra
`mut` keywords.

By not requiring `Sized` we can remove those workarounds.
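
To illustrate the `?Sized` relaxation, here is a small sketch. The trait `VlqWrite`, its method, and the base-128 byte layout are assumptions made for this example and may not match the real `VLQEncode` trait:

```rust
use std::io::{self, Write};

/// Hypothetical trait modeled loosely on `VLQEncode`.
trait VlqWrite {
    fn write_vlq(&mut self, value: u64) -> io::Result<()>;
}

// `?Sized` lifts the implicit `Sized` bound, so `dyn Write` is covered too.
impl<W: Write + ?Sized> VlqWrite for W {
    fn write_vlq(&mut self, mut value: u64) -> io::Result<()> {
        loop {
            let low = (value & 0x7f) as u8;
            value >>= 7;
            if value == 0 {
                // Last byte: continuation bit clear.
                return self.write_all(&[low]);
            }
            // More bytes follow: set the continuation bit.
            self.write_all(&[low | 0x80])?;
        }
    }
}

/// Encode through a trait object, which the blanket impl accepts directly.
fn encode_via_trait_object(value: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    {
        let writer: &mut dyn Write = &mut buf;
        writer.write_vlq(value)?;
    }
    Ok(buf)
}
```

With a plain `impl<W: Write>` bound, the blanket impl would not apply to `dyn Write` itself, which is the kind of situation the workarounds mentioned above had to deal with.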

Reviewed By: quark-zju

Differential Revision: D12816459

fbshipit-source-id: 16353e8fefff5738bd24a9f41c9d7d250aea56fd
2018-10-29 04:10:46 -07:00
Mark Thomas
8c076978ff revisionstore: handle truncated packfiles better
Summary:
If the rust pack stores are used to access truncated pack files, currently they
panic.  Instead, return a proper error showing what's wrong.

Reviewed By: quark-zju

Differential Revision: D10868299

fbshipit-source-id: 57fe5ec1ee4ee2a7bb10d2d5c5ca7082dc34125d
2018-10-27 08:58:24 -07:00
Jun Wu
a8fb0739b5 configparser: port mercurial's configlist to Rust
Summary:
The `configlist` function converts a config value to a list of strings.

I have thought about using pest to parse it. However, pest might return errors
(ex. `a,",b` does not parse due to missing end quote), while the original logic
can happily parse everything (`a,",b` gets parsed into `['a', '"', 'b']`).

The code might be simplified to make it more obvious that `unwrap()` cannot
panic. But it handles so many corner cases that I'd like to port as-is for
correctness.

Reviewed By: DurhamG

Differential Revision: D9323743

fbshipit-source-id: 5f8be562b7437260b7551d87d751424558d76e8f
2018-10-26 21:06:18 -07:00
Jun Wu
3adc813687 codemod: add copyright headers
Summary: This is just the result of running `./contrib/fix-code.py $(hg files .)`

Reviewed By: ikostia

Differential Revision: D10213075

fbshipit-source-id: 88577c9b9588a5b44fcf1fe6f0082815dfeb363a
2018-10-26 15:09:12 -07:00
Durham Goode
79a60403f7 histpack: sort history entries before writing them
Summary:
The histpack format requires that entries in each file section be
written in topological order, so that future readers can compute ancestors by
just linearly scanning. Let's make the rust mutable history pack support this.

Technically the rust historypack reader does not require this for now, but the python
one does, so we need to enforce it.

Reviewed By: kulshrax

Differential Revision: D10441286

fbshipit-source-id: dfdb57182909270b760bd79a100873aa3903a2a5
2018-10-23 17:16:01 -07:00
Mateusz Kwapich
cf5b8e3815 argparse: sync from scm/telemetry/
Summary:
I first forked argparse in May but I didn't end up making many changes to
it. I want to start over with the current state.

Reviewed By: wez

Differential Revision: D10378110

fbshipit-source-id: 7d4220d79a527c16cfcf2f199f19c0c2f417a7ab
2018-10-22 08:29:51 -07:00
Durham Goode
3f06e4734e histpack: fix exponential time bug in rust history pack
Summary:
During an ancestor traversal, we were adding items to the queue if they
hadn't been processed yet. In a highly merge-y history this could result in adding
an exponential number of items to the queue since we aren't preventing items
from being added until they are actually consumed.

The fix is to just add the items to the seen set as we add them to the queue.
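
The fix can be sketched as a breadth-first traversal that marks nodes as seen when they are enqueued. The graph representation (`HashMap<u32, Vec<u32>>`) and the function name are assumptions for illustration, not the real history pack types:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Walk the ancestors of `start`; `parents` is a hypothetical node -> parents map.
/// Returns the visit order and the total number of queue pushes.
fn ancestors(parents: &HashMap<u32, Vec<u32>>, start: u32) -> (Vec<u32>, usize) {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    let mut order = Vec::new();
    let mut pushes = 0;

    seen.insert(start);
    queue.push_back(start);
    pushes += 1;

    while let Some(node) = queue.pop_front() {
        order.push(node);
        for &p in parents.get(&node).into_iter().flatten() {
            // Mark as seen at enqueue time, not at dequeue time, so a node
            // reachable through many merge paths is queued only once.
            if seen.insert(p) {
                queue.push_back(p);
                pushes += 1;
            }
        }
    }
    (order, pushes)
}
```

Because `HashSet::insert` returns `false` for an existing entry, the number of pushes is bounded by the number of distinct nodes.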

Reviewed By: quark-zju

Differential Revision: D10434655

fbshipit-source-id: 430b51adb2d24a99d8c780031f3dbf22c56b9347
2018-10-17 15:00:21 -07:00
Wez Furlong
4797273765 hg: datapack: avoid stack overflow for empty files with no deltas
Summary:
noticed this while trying to load the blob for `fbcode/eden/AUTODEPS`;
we'd stack overflow in here because mpatch_fold wouldn't terminate.

Looking at the code in `scm/hg/hgext/extlib/cstore/uniondatapackstore.cpp`,
there is logic to short circuit when there are no deltas, and throwing that
in here seems to do the right thing.

Reviewed By: quark-zju, ikostia

Differential Revision: D10351279

fbshipit-source-id: 0d340e506fbad2ef056d0b51c474287babf527ce
2018-10-16 10:47:57 -07:00
Kostia Balytskyi
f2b8b4571f hgpython: rename hgenv to be buildenv
Summary: As per quark-zju's request in the earlier diff.

Reviewed By: quark-zju

Differential Revision: D10173168

fbshipit-source-id: 20ab1fbc597b8329bbfec5dabd501d202571bdec
2018-10-12 14:55:09 -07:00
Kostia Balytskyi
0946205e68 hgmain/hgpython: add copyright headers
Reviewed By: farnz

Differential Revision: D10145635

fbshipit-source-id: 0d88c18a44a86a8eb19f40ddba0c13f9570f3a76
2018-10-12 14:55:09 -07:00
Kostia Balytskyi
682e4bed1a hgpython: extract hgpython from hgmain
Summary:
Following the conversation with quark-zju, this in future will help us conditionally dynamically load
the `hgpython` `.dll`/`.so` only if we need it.

Reviewed By: quark-zju

Differential Revision: D10084949

fbshipit-source-id: c20ef014ad9922913ee36d1ec28b0555b64f7d1f
2018-10-12 14:55:09 -07:00
Saurabh Singh
1ffa44eae3 zstd-sys: update the package version
Summary: The old version cannot be found and it's making the build fail.

Reviewed By: markbt

Differential Revision: D10255834

fbshipit-source-id: d14572885423622ecfe3730bbda07ae1bee7363a
2018-10-09 07:26:31 -07:00
Lukas Piatkowski
d01ebe2166 rust-crates-io: add crossbeam to tp2
Reviewed By: ikostia

Differential Revision: D10244968

fbshipit-source-id: 8d06bb64b6a1227ae589caf0588a1f3657603ce9
2018-10-08 21:32:00 -07:00
Saurabh Singh
2def7c19e2 packaging: back out D10213071 to fix continuous build
Summary: D10213071 broke the continuous build. Therefore, backing it out.

Reviewed By: ikostia

Differential Revision: D10238353

fbshipit-source-id: 0b387f6dd802614112cdc969944cbe4c40582b3d
2018-10-08 08:54:08 -07:00
Jun Wu
1cde64ae27 rustlib: move Cargo.toml to top-level
Summary:
This makes all crates' cache shared and unifies Cargo.lock, which
is used by the next diff.

Reviewed By: ikostia

Differential Revision: D10213071

fbshipit-source-id: 48a979c41423a8e8a9795ff102646cce13c39ff4
2018-10-05 16:43:47 -07:00
Jun Wu
100c360e54 indexedlog: mark block as non-code
Summary:
The code block is not a valid Rust program. Mark it as "plain".
This fixes `cargo doc`.

Reviewed By: markbt

Differential Revision: D10137806

fbshipit-source-id: 1197d3a2ebc1450a0738686fa6cfa7c7b79dcb0d
2018-10-03 18:19:27 -07:00
Jun Wu
7752e9e81f rustlib: move Node to a separate "types" crate
Summary:
The `Node` type will be used in multiple places. Let's move it to a standalone
crate so new libraries depending on it won't need to pull in all of
revisionstore's dependencies.

Note: I'd also like the `types` crate to only define clean types. Given the
fact NULL_ID is not a great design in Mercurial (`Option<Node>` is a better
choice in Rust), it probably does not belong to the formal Rust `Node` type.
This diff is merely about moving things with minimal changes. NULL_ID will
be decoupled from `Node` in a follow-up.

Reviewed By: markbt

Differential Revision: D10132047

fbshipit-source-id: 5d05c5e0ac06a2d58556c4db11775503f9495626
2018-10-03 18:19:27 -07:00
Jun Wu
a4434458e6 configparser: fix "%include /foo" on Windows
Summary:
Before this patch, `%include` support on Windows is:

  # Works fine - UNC path: `\\?\c:\1.rc`.
  %include c:\1.rc

  # Works fine - UNC path: `\\?\c:\1.rc`.
  %include \1.rc

  # Works fine - UNC path: `\\?\c:\1.rc`.
  %include c:/1.rc

  # Bad - UNC path: `\\?\c:/1.rc`.
  %include /1.rc

People expect `%include /1.rc` to work on Windows. Fix it by normalizing
the path in `%include` handling.

More context:
Normally, `/` and `\` can be used interchangeably on Windows. But it's not true
for UNC paths. The config parser uses `std::fs::canonicalize` to normalize
paths.  The following Python script demonstrates the difference:

  >>> import os
  >>> open('c:\\1.rc').close()
  >>> os.path.exists('\\\\?\\c:\\1.rc')
  True
  >>> os.path.exists('\\\\?\\c:/1.rc')
  False

Reviewed By: phillco

Differential Revision: D10036882

fbshipit-source-id: fd85e0bc86d1e5776701077751ac875e71d60568
2018-09-25 13:21:43 -07:00
Jun Wu
fe900beb40 profiling: move $HGPROF handling to configparser
Summary:
It's cleaner for the config parser to take care of environment variable
handling.

A side effect of this change is, `$HGPROF` only affects `profiling.type`,
not `profiling:foo.type`, which is more desirable since we don't want
`profiling:foo.type` to be overridden by `$HGPROF`.

Reviewed By: markbt

Differential Revision: D9828547

fbshipit-source-id: 27be3683beee60a4eee6040ca1b4160dc1a89f73
2018-09-21 14:37:23 -07:00
Harvey Hunt
70a0c74d3b Implement a bookmark store for managing mercurial bookmarks
Summary:
Create a storage object that can be used to load bookmarks from a
mercurial file, modify and query the bookmarks in memory and then write back
to a mercurial bookmark file.

Reviewed By: quark-zju

Differential Revision: D9768564

fbshipit-source-id: ed469d0e588ae2200d614bf62a5a0b577e7c6f74
2018-09-20 05:05:08 -07:00
Harvey Hunt
c507d4e818 Implement Display trait for revisionstore Node
Summary:
Copy functions from Mononoke to implement the Display trait
for a Node.

Reviewed By: quark-zju

Differential Revision: D9768566

fbshipit-source-id: 6961026a9e4cdaf4a0f2592dc9284abebadb0aa3
2018-09-20 05:05:08 -07:00
Jun Wu
b212efa921 configparser: preserve leading new-lines
Summary:
Preserve leading (but not trailing) new lines so the config (where `_` denotes a
space):

  x_=__
  __Foo
  __

is parsed as `"\nFoo"`.

This is useful in template configs.

Reviewed By: ryanmce

Differential Revision: D9929764

fbshipit-source-id: e30659df94937c7c2121627f42ea425191003fb1
2018-09-19 11:54:22 -07:00
Jun Wu
acc7039436 configparser: permit spaces in more cases
Summary:
Be more permissive about spaces. Namely:
- Spaces after a section name like `[foo]    ` are allowed.
- Spaces in config names are allowed.
- Spaces at trailing lines are ignored and no longer insert an `\n` to the previous config.

This makes it closer to the older config parser behavior, but it still
differs in some cases: `[foo]]`, `[foo] # bar`, `[foo]]` still do not
parse.

Benchmark shows no obvious (within 10%) slowdown. So this is probably fine.

Reviewed By: strager

Differential Revision: D9620253

fbshipit-source-id: 8489ef8e83606d0557db56e8da0a017d55ff1514
2018-09-12 12:05:32 -07:00
Liubov Dmitrieva
32cccf30f2 commit cloud sync: fast path for pull
Summary:
Maybe useful as a backup for the regular path and also for syncing speed up.

Scm daemon knows the new and removed heads, so if, for example, there is 1 new and 1 removed head, it is most probably just an amend. Scm daemon can therefore try the fast path first, depending on the information in the notification, and fall back to the slow path if that fails.

This gives users a better experience before Mononoke: it is much faster, and scm daemon makes 2 attempts anyway.

Reviewed By: quark-zju

Differential Revision: D9309856

fbshipit-source-id: d59f498160a45fab11760b5c1397b48470feb7f8
2018-09-10 15:05:25 -07:00
Jun Wu
43bda98976 configparser: add a benchmark parsing large files
Summary: This would provide information about performance changes.

Reviewed By: singhsrb

Differential Revision: D9620252

fbshipit-source-id: 51d243b50b349c63e552bd1c43db17497025f73a
2018-09-07 16:56:38 -07:00
Kostia Balytskyi
59e00ccf47 hg: add some convenient panicking conversion to encoding
Summary:
Local bytes `&[u8]` or `Vec<u8>` is frequently wrapped into a `CString`,
because `CString` includes a trailing 0. Let's add a helper for that.
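
A minimal sketch of what such a panicking helper might look like. The function name is hypothetical and not taken from the `encoding` crate:

```rust
use std::ffi::CString;

/// Hypothetical helper: wrap local bytes in a `CString`, panicking if the
/// input contains an interior NUL byte.
fn local_bytes_to_cstring(bytes: &[u8]) -> CString {
    // `CString::new` appends the trailing 0 and rejects embedded 0 bytes.
    CString::new(bytes).expect("local bytes must not contain a NUL byte")
}
```

The panic is a deliberate convenience trade-off: callers that cannot rule out embedded NUL bytes should handle the `Result` from `CString::new` themselves.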

Reviewed By: quark-zju

Differential Revision: D9482435

fbshipit-source-id: 096ba725d83acc9c5fc1fe836dce509fe36e49e9
2018-08-30 04:42:11 -07:00
Kostia Balytskyi
8693c2d67b hg: improve encoding::path_to_local_bytes on Windows
Summary:
This improvement avoids an extra conversion to String (which can fail if
something cannot be encoded as UTF8).

Reviewed By: quark-zju

Differential Revision: D9447823

fbshipit-source-id: fa13ff9b833cc4edf9f5dc518b3f8712518c97fd
2018-08-30 02:51:13 -07:00
Jun Wu
d38749a9fd configparser: expand environment variables in HGRCPATH
Summary:
liubov-dmitrieva encountered an issue where her home hgrc is not loaded. That's because
environment variables in HGRCPATH are not expanded. Fix it by calling
`expand_path` on the paths.

Reviewed By: phillco

Differential Revision: D9499239

fbshipit-source-id: cd4b7a26fd12f1c3148a21dbb5584bbeb3885286
2018-08-28 19:51:12 -07:00
Kostia Balytskyi
9157270708 hg: fix the system config locations for Windows in Rust configparser
Summary:
Originally both `mercurial.ini` and `hgrc.d` were looked up in the same location
as main Mercurial executable, not in the `datadir`. See from `scmwindows.py`:
```
    filename = util.executablepath()
    # Use mercurial.ini found in directory with hg.exe
    progrc = os.path.join(os.path.dirname(filename), "mercurial.ini")
    rcpath.append(progrc)
    # Use hgrc.d found in directory with hg.exe
    progrcd = os.path.join(os.path.dirname(filename), "hgrc.d")

```

Reviewed By: quark-zju

Differential Revision: D9540052

fbshipit-source-id: d5921193dd14fcb46cf428aaa77d26a58aef7868
2018-08-28 11:27:50 -07:00
Jun Wu
8356dbd506 configparser: fix windows EOL handling
Summary:
Without this patch, all hg commands will fail with our current config:

  hg: parse error: <filename>
   --> 6:5
    |
  6 |     commandexception
    |     ^---
    |
    = expected new_line

The config is:

  [blackbox]
  track = command
      commandexception
      ...

Because "\r\n" was treated as the same as double "\n"s.

Reviewed By: ryanmce

Differential Revision: D9494909

fbshipit-source-id: 64ef173c69f3cf61d4e71116c581dbca72fb2c4b
2018-08-24 00:05:34 -07:00
Kostia Balytskyi
819d195bc8 hg: add some platform-specific tests for encoding crate
Summary:
This just adds some more tests that in case of Windows try to execute the
string encoding APIs directly, while paying attention to the ANSI Code Page.

Reviewed By: quark-zju

Differential Revision: D9441406

fbshipit-source-id: c0873dca9fc8775839a62da60af46ff29e700634
2018-08-22 09:06:22 -07:00
Kostia Balytskyi
71baca5dfa hg: add osstring_to_local_bytes function to the encoding crate
Summary: Another convenience method that I plan to use in the `hgmain` later.

Reviewed By: quark-zju

Differential Revision: D9441426

fbshipit-source-id: 007e4932a344b9d1c8d4d654152bcca5c2362431
2018-08-22 09:06:22 -07:00
Kostia Balytskyi
1175b6b1c6 hg: extract platform-specific bits of encoding into separate files
Summary:
I think it's more readable to split the implementations into platform-specific
bits.

Reviewed By: quark-zju

Differential Revision: D9441424

fbshipit-source-id: 136d5a00aa4ed8cf4f0886bda0f77a40cba1f542
2018-08-22 09:06:21 -07:00
Kostia Balytskyi
fa882cf7da hg: make encoding crate use ANSI code page, not OEM
Summary:
We almost never need an `OEM` code page: Windows API calls use ANSI-encoded
strings if they are `A` calls and Wide strings if they are `W` calls.

Reviewed By: quark-zju

Differential Revision: D9441425

fbshipit-source-id: 979697c349389ea4f7569be9949be3b636f6063c
2018-08-22 09:06:21 -07:00
Kostia Balytskyi
25a8ee686f hg: rename pathencoding into encoding
Summary:
In the later diffs I'll add some more functionality there, not strictly
related to encoding paths.

Reviewed By: quark-zju

Differential Revision: D9441427

fbshipit-source-id: 069ab30a24761038fa2c1a4f180bbc0699d38ef9
2018-08-22 09:06:20 -07:00
Puneet Kaushik
9b9126c79f Started Eden for Windows and integrated hg store with it.
Summary:
This diff is first in the series to make Eden work on Windows. It includes:

1. HG backing store and Object store, which provides the capability to talk to mercurial and fetch the file and folder contents on Windows.
2. Subprocess and Pipe definition for Windows.
3. The Visual studio solution and projects files to compile Eden and scm datapack.

A few important points:

1. Most of the changes to existing code are done under a macro EDEN_WIN so that they don't impact other platforms.
2. Sqlite is used for caching the fetched contents. We are not using Rocksdb on Windows.
3. The main function only calls some test code and exits after printing the output.
4. The initializeMononoke code is disabled for Windows because it needs Proxygen to talk HTTP. Will enable this once I get Proxygen and other dependencies working.
5. HgImporter passes Windows handles to hg_import_helper as command line args. The code to convert these handles into fds is in a separate diff.

Reviewed By: wez

Differential Revision: D8653992

fbshipit-source-id: 52a3c3750425fb92c2a7158c2c214a9372661e13
2018-08-21 17:51:26 -07:00
Durham Goode
a0d0a75c44 revisionstore: add unit tests for ancestor logic
Summary:
This was meant to be in a prior diff but was forgotten. This also
exposes an issue where we aren't producing ancestors in topological order.

Reviewed By: quark-zju

Differential Revision: D9380009

fbshipit-source-id: 6a49f0f31c3e107353f9192ca15cda0b1b9c3693
2018-08-17 12:49:57 -07:00
Jun Wu
284c2f5ccb configparser: move some features to hg module
Summary:
The config remapping and whitelisting features are hg-specific and are
implemented using the `append_filter` API exposed by `config.rs`. They are more
like "extended features", so move them to `hg.rs`.

Reviewed By: DurhamG

Differential Revision: D9323789

fbshipit-source-id: 89bc4416ee7276c2d1d4db8eba6404747cbb4ec4
2018-08-17 12:21:31 -07:00
Jun Wu
36c4ce233d config: be compatible with Windows-style environment variables
Summary:
On Windows `%include` can have paths containing environment variables like
`%PROGRAMDATA%`. We already ship that kind of config files to users therefore
let's add support for that.

The change assumes `%` is not used as part of a normal path, which is probably
good enough for practical uses. If `%` does need to be legally used in a
filename, we can add escaping support later.
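
The expansion described above might look roughly like this. The function name, the closure-based lookup, and the handling of unknown or unmatched `%` are assumptions for illustration; the real config parser may behave differently:

```rust
/// Expand `%NAME%` references using `lookup` (hypothetical helper).
/// Unknown names and unmatched `%` characters are left untouched.
fn expand_percent_vars<F>(input: &str, lookup: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find('%') {
        out.push_str(&rest[..start]);
        let after = &rest[start + 1..];
        match after.find('%') {
            Some(end) => {
                let name = &after[..end];
                match lookup(name) {
                    Some(value) => out.push_str(&value),
                    // Keep the original text when the variable is unknown.
                    None => {
                        out.push('%');
                        out.push_str(name);
                        out.push('%');
                    }
                }
                rest = &after[end + 1..];
            }
            None => {
                // Unmatched `%`: copy the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

In real use the lookup could be `|name| std::env::var(name).ok()`; taking a closure keeps the sketch testable without touching the process environment.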

Reviewed By: DurhamG

Differential Revision: D9283303

fbshipit-source-id: bcc80307fe19dfc40aea88b6a0a5f69681e835fc
2018-08-17 11:50:37 -07:00
Durham Goode
6dfc0351f4 revisionstore: don't allow loading non-v1 historypacks
Summary:
v0 history packs require more complicated and slow logic for looking up
a node.  Instead of complicating our rust implementation, let's just not support
v0.

Reviewed By: quark-zju

Differential Revision: D9373395

fbshipit-source-id: 6d28a3684966b55a617619e3cae765b2944919a0
2018-08-17 09:39:36 -07:00
Durham Goode
58b15fd23c revisionstore: make get_ancestors return an error if it can't find the key
Summary:
When calling get_ancestors with 'partial' enabled, we want to return a
key error if the first key can't be found, but not if later keys can't be found.

Reviewed By: singhsrb

Differential Revision: D9367477

fbshipit-source-id: 0e9ad7ea82f83db7326392accab96bd31318f28e
2018-08-17 09:39:36 -07:00
Durham Goode
6cd8838c3c revisionstore: change HistoryIndex to accept references to filenames
Summary:
Previously HistoryIndex.write() accepted a vector and a hashmap that
contained Box<[u8]>. This diff changes it to be &Box<[u8]>, which allows us to
avoid a ton of allocations.

Reviewed By: quark-zju

Differential Revision: D9350962

fbshipit-source-id: 3f900c551584e3431202f3a30afd61aa10fbb78c
2018-08-16 12:34:52 -07:00
Durham Goode
94c6ad00ef revisionstore: remove into_boxed_slice usage
Summary:
I learned that Box::from() can be used to copy a slice into a box, so
let's replace my previous to_vec().into_boxed_slice() with this.
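
For reference, the two forms side by side in a small sketch (the function name is made up for this example):

```rust
/// Copy a byte slice into a boxed slice in one step.
fn boxed_copy(data: &[u8]) -> Box<[u8]> {
    // Same result as `data.to_vec().into_boxed_slice()`, written more directly.
    Box::from(data)
}
```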

Reviewed By: quark-zju

Differential Revision: D9350961

fbshipit-source-id: 94053b82cd64923dfabc9acf3a9dab6daca20cf3
2018-08-16 12:34:52 -07:00
Durham Goode
37c830f836 fanouttable: make locations an Option
Summary:
Only the dataindex needs the actual locations, so let's make the
locations vector optional and only pass it from dataindex.

Based on feedback from an earlier code review.

Reviewed By: quark-zju

Differential Revision: D9350960

fbshipit-source-id: 54ec34e1bd891ae585b22d916664700ce5417353
2018-08-16 12:34:52 -07:00
Durham Goode
f71eb28587 revisionstore: fix build break on older Rust
Summary: Apparently older rust needs a & on these match statements.

Reviewed By: phillco, quark-zju

Differential Revision: D9363026

fbshipit-source-id: fa802464d01b4074546076888e6d5c92155ddf4e
2018-08-16 11:05:25 -07:00
Durham Goode
5283a0ca54 revisionstore: add SliceExt for receiving Result<> from slice reads
Summary:
Rust doesn't provide a convenient way to do `slice[range]?`, so let's
introduce an extension trait for allowing slice range reads and getting a Result
back.
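
A sketch of what such an extension trait could look like. The method name `get_err`, the `OutOfBounds` error type, and the `read_header` caller are assumptions for illustration; the real trait is likely generic and uses the crate's own error type:

```rust
use std::ops::Range;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct OutOfBounds {
    start: usize,
    end: usize,
    len: usize,
}

/// Extension trait sketch: range reads that return `Result` instead of panicking.
trait SliceExt {
    fn get_err(&self, range: Range<usize>) -> Result<&[u8], OutOfBounds>;
}

impl SliceExt for [u8] {
    fn get_err(&self, range: Range<usize>) -> Result<&[u8], OutOfBounds> {
        let (start, end, len) = (range.start, range.end, self.len());
        // `get` returns `None` for out-of-range or inverted ranges.
        self.get(range)
            .ok_or_else(|| OutOfBounds { start, end, len })
    }
}

/// Example caller: `?` works where `data[0..4]` would panic on short input.
fn read_header(data: &[u8]) -> Result<&[u8], OutOfBounds> {
    let header = data.get_err(0..4)?;
    Ok(header)
}
```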

Reviewed By: markbt

Differential Revision: D9276216

fbshipit-source-id: 9a8cea8ffc062c4a2dd432dd4de7fdd4ccabf8d3
2018-08-15 15:24:40 -07:00
Durham Goode
70c243ffcb pyrevisionstore: implement Repackable for HistoryPack
Summary:
Before we can finish the python bindings for HistoryPack, we need to
implement the Repackable trait.

Reviewed By: markbt

Differential Revision: D9273264

fbshipit-source-id: ed181d73c497a84fed5e0c85fad1d7d73ec52e4e
2018-08-15 15:24:40 -07:00
Durham Goode
489fc12598 historypack: implement get_ancestors
Summary: Implements the last HistoryStore api.

Reviewed By: markbt

Differential Revision: D9231388

fbshipit-source-id: 061df3f89c4abadad1e1abcf002bb38d9069ef6e
2018-08-15 15:24:39 -07:00
Durham Goode
4ac8d1cf0c historystore: add traversal options for ancestor traversals
Summary:
Previously ancestor traversals would return an error if they
encountered a node that couldn't be resolved. In some cases we want to support
partial ancestor resolutions (like iterating over part of a history in one
store, while the rest in another store). Let's add an option that decides
whether a missing node is an error or not.

Reviewed By: markbt

Differential Revision: D9231397

fbshipit-source-id: ff3063acfb8da2d453f34221f1865f3123615b0c
2018-08-15 15:24:39 -07:00
Durham Goode
faa6b7a152 historypack: implement get_missing
Summary: Implements another piece of the HistoryStore API.

Reviewed By: markbt

Differential Revision: D9231396

fbshipit-source-id: b2949396eab999c46cf1a2c24fa7a329a5971682
2018-08-15 15:24:39 -07:00
Durham Goode
e41732f1e7 historypack: implement HistoryStore::get_node_info
Summary: Implements the first of our HistoryStore APIs.

Reviewed By: markbt

Differential Revision: D9231390

fbshipit-source-id: 72e43819b4f0ad6e0aa80c6793ac28d51c1435ef
2018-08-15 15:24:39 -07:00
Durham Goode
7a459efe56 historypack: initial HistoryPack boiler plate
Summary:
Now that MutableHistoryPack and HistoryIndex are implemented, we can
put the final HistoryPack code in place. Let's start by putting the boiler plate
definition down.

Reviewed By: markbt

Differential Revision: D9231387

fbshipit-source-id: d90f20603a7e08becda604eeda90d62ef4e88cbb
2018-08-15 15:24:39 -07:00
Durham Goode
a54880cf6a historypack: make mutablehistorypack write to disk
Summary:
Now that we have serializers for all the individual parts, let's make mutablehistorypack actually serialize.

This is still missing the bit where we topologically sort the nodes before writing them, so that will come later.

Reviewed By: markbt

Differential Revision: D9231401

fbshipit-source-id: 85703a44420bd9eee80fabe1fa4ffb0ebff3ecfd
2018-08-15 15:24:39 -07:00
Durham Goode
5e01141c7b historypack: implement HistoryEntry serialization logic
Summary:
In an upcoming diff we'll begin writing the actual history data pack
file. It is primarily composed of HistoryEntry's so let's implement and unit
test the serialization logic for them.

Reviewed By: markbt

Differential Revision: D9231391

fbshipit-source-id: 1b070ee5d15e06ea0c70a678dc6eb129e0ffaa20
2018-08-15 15:24:39 -07:00
Durham Goode
6158355747 historypack: add serialization logic for FileSectionHeaders
Summary:
In an upcoming diff we'll start writing the actual history pack data
file. The file section headers are part of that file, so let's implement the
serialization logic and unit test it.

Reviewed By: markbt

Differential Revision: D9231400

fbshipit-source-id: eb1494e3b8fe3419f77edcaab25f640b48f16e4b
2018-08-15 15:24:39 -07:00
Durham Goode
1d19243ea1 datapack: change ok_or to ok_or_else
Summary:
The old ok_or would allocate a string every time. Since this is an
error condition, let's use ok_or_else to only allocate the string if there's an
error.
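
The difference can be seen in a small sketch. The `lookup` function and the `HashMap` index are hypothetical stand-ins, not the datapack code:

```rust
use std::collections::HashMap;

/// Look up `key`, building the error message only on the failure path.
fn lookup(index: &HashMap<String, u64>, key: &str) -> Result<u64, String> {
    // `ok_or(format!(...))` would build the message even when the key exists;
    // the closure passed to `ok_or_else` runs only for `None`.
    index
        .get(key)
        .copied()
        .ok_or_else(|| format!("key not found: {}", key))
}
```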

Reviewed By: markbt

Differential Revision: D9231394

fbshipit-source-id: e4912bfd26925077fffb7c686a7c1c2f3cb36f7c
2018-08-15 15:24:39 -07:00
Durham Goode
297cfe7625 historyindex: add node entry lookup
Summary: Add an API for reading individual nodes from an index.

Reviewed By: markbt

Differential Revision: D9231398

fbshipit-source-id: 1d7c4b34b121cba62bc28eba2323807cfedbaf3b
2018-08-15 15:24:39 -07:00
Durham Goode
176881503f historyindex: add file entry lookup
Summary:
Now that historyindex can serialize, let's add logic to perform file
entry lookups. A later diff will allow node lookups.

Reviewed By: markbt

Differential Revision: D9231392

fbshipit-source-id: 9ab8e29ce85c0f372a7a432318d6d903e6c44bcc
2018-08-15 15:24:38 -07:00
Durham Goode
9cdb8d9e42 historyindex: write the actual index
Summary:
Adds logic to write the actual index, including the fanout table, the
file name index, and the node indexes.

Reviewed By: markbt

Differential Revision: D9231402

fbshipit-source-id: f382c4a56c5c53b83232b43adb966a7aff3878db
2018-08-15 15:24:38 -07:00
Durham Goode
36ab87c75f historyindex: add initial reader/writer
Summary:
Adds the initial reader/writer boiler plate for creating a HistoryIndex
and writing the header.

Reviewed By: markbt

Differential Revision: D9231389

fbshipit-source-id: ece1290416e8cde23a825ee3bd1a555a4ebded35
2018-08-15 15:24:38 -07:00
Durham Goode
a886ff48c2 historypack: add read/write for HistoryPackOptions
Summary: Adds reader/writer logic for the history pack options header.

Reviewed By: markbt

Differential Revision: D9231393

fbshipit-source-id: 384846e7e46a6488dc5a281210a05376d3a8dbb8
2018-08-15 15:24:38 -07:00
Durham Goode
68e2cdfb3e historypack: implement NodeIndexEntry read/write logic for historyindex
Summary:
To start the history pack implementation, let's start by implementing
reader/writers for the various parts. In this diff we do the NodeIndexEntry.

Reviewed By: markbt

Differential Revision: D9231403

fbshipit-source-id: 904c1a094e63b0f4cebef84a30a7dd89bdaf1e1f
2018-08-15 15:24:38 -07:00
Durham Goode
694cc78523 historypack: implement FileIndexEntry read/write logic for historyindex
Summary:
To start the history pack implementation, let's start by implementing
reader/writers for the various parts. In this diff we do the FileIndexEntry

Reviewed By: markbt

Differential Revision: D9231395

fbshipit-source-id: d054959796ee4e3d51df8f3533712f8f959a04d2
2018-08-15 15:24:38 -07:00
Durham Goode
3b5966895f mutablehistorypack: implement get_ancestors
Summary: Implements the HistoryStore get_ancestors api.

Reviewed By: quark-zju

Differential Revision: D9136980

fbshipit-source-id: 59b7a1d51c4bf95edec452fcb912fb7647151d24
2018-08-15 15:24:38 -07:00
Durham Goode
c8d9db34df revisionstore: add AncestorIterator class
Summary:
This moves the ancestor iteration logic for cases where we iterate one
by one. This will be used by the HistoryPack code in upcoming diffs.

Reviewed By: quark-zju

Differential Revision: D9136978

fbshipit-source-id: e60b0a1e2ee5036938b51bbd910fbaf548d7aa75
2018-08-15 15:24:38 -07:00
Durham Goode
550d912ae0 revisionstore: add BatchedAncestorIterator class
Summary:
This moves the ancestor iteration logic into its own class, with
support for cases where we receive bulk sets of ancestors at once. A future diff
will add similar logic for ancestor traversals where we receive one hash at a
time.

Reviewed By: quark-zju

Differential Revision: D9136985

fbshipit-source-id: 7f918476f777020b3436f5104ad3bf4b00fe9827
2018-08-15 15:24:38 -07:00
Durham Goode
d45da0c3aa mutablehistorypack: implement get_node_info
Summary: Implements get_node_info in the HistoryStore trait.

Reviewed By: quark-zju

Differential Revision: D9137007

fbshipit-source-id: e98b5ed247b5756074902a155fd31eeff8e176d8
2018-08-15 15:24:38 -07:00
Durham Goode
8e6bf4f8f5 mutablehistorypack: implement get_missing()
Summary: Initial implementation for get_missing on MutableHistoryPack

Reviewed By: quark-zju

Differential Revision: D9136983

fbshipit-source-id: ea6c7a7a513d9ef8f2c06a1e6601109fc6e9ebce
2018-08-15 15:24:38 -07:00
Durham Goode
1ac2e4a88b mutablehistorypack: implement add()
Summary:
The initial code for implementing a mutable history pack. Future diffs
will add logic that serializes this to a pack file, an index file, and adds
a history pack reader class.

Reviewed By: quark-zju

Differential Revision: D9136997

fbshipit-source-id: 7e80613eb4cc0cb51a977a4a449d565ab1d0ce80
2018-08-15 15:24:38 -07:00
Durham Goode
01b81f1217 tests: remove cargo warnings
Summary: Fix up some simple cargo test warnings

Reviewed By: quark-zju

Differential Revision: D9136979

fbshipit-source-id: dd10ea6751eb68190190381dc69b3494160cf358
2018-08-15 15:24:38 -07:00
Jun Wu
f85675d80c configparser: record file content in ValueSource struct
Summary:
Embed a snapshot of the config file at the parsing time. So applications can
have access to them, and can do things like calculating line numbers, or editing
the config files.

This is shallow copy. So it does not affect performance.

Reviewed By: DurhamG

Differential Revision: D8960872

fbshipit-source-id: e1905712dbec4b02d93a4fecc97064f0e00024c8
2018-08-09 21:21:49 -07:00
Jun Wu
134541a7ae configparser: make load_system and load_user return errors
Summary: Otherwise there is no way to get parse errors with the last change.

Reviewed By: DurhamG

Differential Revision: D8960867

fbshipit-source-id: 48ef748096a67baa155bddf202c8ebec7ed1eeb5
2018-08-09 21:21:49 -07:00
Jun Wu
cb58cd0d26 configparser: return errors instead of keeping them
Summary:
Change the API to return parse errors directly, instead of keeping them in
ConfigSet struct. This makes it easier to get errors related to one of the
"parse" calls.

Reviewed By: DurhamG

Differential Revision: D8960869

fbshipit-source-id: fbd571f264415e788c5ac44961149d1498826a6d
2018-08-09 21:21:49 -07:00
Jun Wu
d028811a8f configparser: strip leading space from multi-line value
Summary:
Multiline value like:

  [section]
  foo = a
    b

should be parsed as "a\nb", instead of "a \nb".

It does not affect configlist, but affects template definitions.

Unfortunately in this case we had to allocate a new buffer instead of using
`Bytes::slice`. Fortunately most configs are single-line, so the performance
impact is hardly visible practically.
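
A minimal sketch of the behaviour described above, not the parser's actual code: `join_multiline_value` is a hypothetical helper that trims each line of a raw multi-line value and joins the lines with `\n`. It builds a new `String`, which mirrors the note about allocating a new buffer.

```rust
/// Hypothetical helper: join a raw config value that spans several lines.
/// Each line is trimmed on both sides, so the indentation of continuation
/// lines and trailing spaces do not end up in the parsed value.
fn join_multiline_value(raw: &str) -> String {
    raw.lines()
        .map(|line| line.trim())
        .collect::<Vec<_>>()
        .join("\n")
}
```

For the example above, the raw text `"a\n  b"` becomes `"a\nb"`.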

Reviewed By: DurhamG

Differential Revision: D8960866

fbshipit-source-id: 011e7f431d682236529ce176fe577aac6a010d91
2018-08-09 21:21:48 -07:00
Jun Wu
28b42961e8 configparser: use indexmap
Summary:
Switch to indexmap, which is more actively maintained than linked-hash-map.

There is no visible performance difference when parsing large config files.

Reviewed By: DurhamG

Differential Revision: D8960870

fbshipit-source-id: 8d6650e2d8b14989061dceb2081a3f93004cea76
2018-08-09 21:21:48 -07:00
Jun Wu
3bfce55697 configparser: use dirs crate
Summary:
`home_dir` in stdlib is going to be deprecated. Therefore switch to
external crate.

Reviewed By: DurhamG

Differential Revision: D8960874

fbshipit-source-id: e123debc5c58e6a632a801dedcd9fc6834cb1f65
2018-08-09 21:21:48 -07:00
Jun Wu
8aefad1c97 configparser: use shellexpand crate to expand paths
Summary: The crate also helps expanding environment variables.

Reviewed By: DurhamG

Differential Revision: D8960873

fbshipit-source-id: c83fc7256a8297752a14c1d86d1ddb3735f95682
2018-08-09 21:21:48 -07:00
Jun Wu
a4129f8d53 configparser: use pest to parse config files
Summary:
[pest](https://github.com/pest-parser/pest) is an elegant Rust library for
parsing text.

A naive benchmark on a 1MB config file shows pest is about 1.5 to 2x slower.
But the better error message and cleaner code seems worth it.

Practically, in a VirtualBox VM, parsing a set of our config files takes 3-7ms.
The overhead seems to be opening too many files. Reducing it to one file makes
parsing complete in 2-4ms.

Unfortunately the buck build has issues with the elegant syntax
`#[grammar = "spec.pest"]`, because "spec.pest" cannot be located by pest_derive.
Therefore a workaround is used to generate the parser.

The motivation behind this is because I noticed multi-line value can not be
taken as a plain Bytes slice. For example:

  [section]
  foo = line1
    line2

"foo" should be "line1\nline2", instead of "line1\n  line2". It does not make a
difference on configlist. But it affects templates. Rather than making the
parser more complex, it seems better to just adopt a reasonably fast parsing
library.

Reviewed By: DurhamG

Differential Revision: D8960876

fbshipit-source-id: 2fa04e38b706f7126008512732c9efa168f84cc7
2018-08-08 17:20:00 -07:00
Jun Wu
0ea7f4aa94 configparser: skip space characters
Summary:
Previously, a line with all space characters is considered "illegal" and I
didn't handle it. It would actually be parsed as part of config name.

Let's skip them. So config files with spaces would behave sanely.

Reviewed By: DurhamG

Differential Revision: D8887370

fbshipit-source-id: e55d221d281fc58b2d2efbcb9196e7f68a78d719
2018-08-08 17:20:00 -07:00
Jun Wu
4f6c9b1a5e configparser: add a way to clone configs
Summary: Mercurial's ui.py needs a way to copy configs.

Reviewed By: DurhamG

Differential Revision: D8886245

fbshipit-source-id: b936edf5e215ecae078d992a344bcecef7fcd7f3
2018-08-08 17:20:00 -07:00
Jun Wu
d1e4252154 configparser: add a way to mark configs as read-only
Summary:
Command-line flags override config files configs. However, config files load
after parsing command-line flags in Mercurial's current logic. Therefore, a way
to make sure config files do not override command line flags is needed.

`ui.py` uses two config objects `ui._ocfg`, `ui._ucfg` and calls
`_ucfg.update(_ocfg)` after loading a config file to solve the problem.
That adds overhead updating ucfg.

With configparser's "filter" API, instead of rewriting configs afterwards,
the configs can be stopped from loading via files in the first place. So
there is no overhead maintaining two config sets and updating them.

Reviewed By: DurhamG

Differential Revision: D8960877

fbshipit-source-id: cf7b9a820911638956e123c1c93d3febeabf53c2
2018-08-08 17:20:00 -07:00
Jun Wu
39a4651bc3 configparser: implement system and user config loading
Summary:
This is to locate system / user config files at some fixed places.
It will replace most of `mercurial.rcutil`, and be used in native clients.

Reviewed By: DurhamG

Differential Revision: D8895791

fbshipit-source-id: 47166b943a3bd90a8aff1c15674a3da4e14bf8d3
2018-08-08 17:20:00 -07:00
Jun Wu
5377ca5f97 configparser: implement HGPLAIN handling
Summary:
Implement HGPLAIN handling using the filter feature of `Options`.

This is hg-specific, therefore it's implemented in a separate `hg` module, as
an extension to `config::Options`.

The `hg` module could contain more hg related logic, like locating system
and user config files, to make it easier to use by Eden.

The plan is to have this as the single source of truth handling HGPLAIN
environment variables and migrate other places reading HGPLAIN to
use side effects caused by functions defined here. The side effects
are ideally just normal config options accessible via `ConfigSet::get` APIs,
instead of another special case (ex. `HgPlain::get(name) -> bool`).

Reviewed By: DurhamG

Differential Revision: D8895788

fbshipit-source-id: fa0ad7e7207513d947216292cbbd65530391cf11
2018-08-08 17:19:59 -07:00
Jun Wu
e33154698b Back out "Reuse pylz4 encoding between hg and Mononoke into a separate library"
Summary:
Backout D9124508.

This is actually more complex than it seems. It breaks non-buck build
everywhere:

- hgbuild on all platforms. POSIX platforms break because `hg archive` will
  miss `scm/common`. Windows build breaks because of symlink.
- `make local` on GitHub repo because `failure_ext` is not public. The `pylz4`
  Cargo.toml has missing dependencies.

Fixing them correctly seems non-trivial. Therefore let's backout the change to
unblock builds quickly.

The linter change is kept in case we'd like to try again in the future.

Reviewed By: simpkins

Differential Revision: D9225955

fbshipit-source-id: 4170a5f7664ac0f6aa78f3b32f61a09d65e19f63
2018-08-08 12:20:54 -07:00
Tuan Tran
f50d617d2d Reuse pylz4 encoding between hg and Mononoke into a separate library
Summary: Moved the lz4 compression code into a separate module in `scm/common/pylz4` and redirected code referencing the former two files to the new module

Reviewed By: quark-zju, mitrandir77

Differential Revision: D9124508

fbshipit-source-id: e4796cf36d16c3a8c60314c75f26ee942d2f9e65
2018-08-08 10:08:11 -07:00
Liubov Dmitrieva
d2be4f57d0 scm daemon: fix spelling
Reviewed By: markbt

Differential Revision: D9194901

fbshipit-source-id: 1f077a5778bee2d2b1f62b2d10beff3dd3365471
2018-08-07 06:39:45 -07:00
Jun Wu
7d346e6bc2 ignore: support global gitignore configs
Summary:
Change the Rust ignore matcher to accept an extra list of gitignore files.
Parse "git:" entries of "ui.ignore" to be git ignore files.

Reviewed By: DurhamG

Differential Revision: D8863905

fbshipit-source-id: 0cd5e29e01f01496ff61c81b89f7876202f18a98
2018-08-02 20:22:47 -07:00
Hugh Harris
99a1b993c3 Add secrets authentication for commitcloud in scm_daemon
Summary: If the daemon can't find the token file, it will try to read from secrets_tool on unix-like systems. Integrates well with people who have enabled the secrets_token option as their token file will have been deleted.

Reviewed By: liubov-dmitrieva

Differential Revision: D9029795

fbshipit-source-id: b364d9e8885ee0473b8d1effd6ee0b2e86a699f9
2018-08-02 12:06:47 -07:00
Jeremy Fitzhardinge
08f618a7f3 tp2/rust: rust-crates-io update
Summary: Update rust-crates-io. Small changes needed for failure 0.1.2 update.

Reviewed By: rahulg

Differential Revision: D9125235

fbshipit-source-id: fd98af065b54e207fcb2c3cfc9dd9a2d325cc6c8
2018-08-02 10:05:38 -07:00
Jun Wu
dafc189588 treestate: fix documentation about FilteredKeyCache
Summary: It's just a documentation fix.

Reviewed By: singhsrb

Differential Revision: D9110152

fbshipit-source-id: ce4065b7aad6fac05f4c27ef7d2569352cdd2633
2018-07-31 16:35:50 -07:00
Jun Wu
03a0270913 treestate: change getfiltered API to return all matched entries
Summary:
Case-folding could be more complex than what Mercurial currently handles.
Suppose the following paths are committed to a repo using a case-sensitive
filesystem:

  a/a/A
  a/A/a
  A/a/a

Then querying "a/a/a" with a "normpath" filter should ideally have access to
all the above paths.

Unfortunately, the API is changed to use copy instead of references, as it's
impossible to return multiple values borrowed from `&mut self`.

Changes are made on treestate Python land as well to use the new API.  This
solves issues about case-folding corner cases covered by test-eol.t and
test-casefolding.t.
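
As a rough illustration of why returning every match matters, the sketch below uses a plain slice of paths instead of the real tree structure. `fold_case` and `get_filtered` are hypothetical names, and ASCII lowercasing stands in for the "normpath" filter.

```rust
/// Hypothetical normalization filter: ASCII case folding.
fn fold_case(path: &str) -> String {
    path.to_ascii_lowercase()
}

/// Return an owned copy of every stored path that normalizes to the same
/// key as `query`, instead of a single borrowed entry.
fn get_filtered(paths: &[&str], query: &str, normalize: fn(&str) -> String) -> Vec<String> {
    let wanted = normalize(query);
    let mut matches = Vec::new();
    for &path in paths {
        if normalize(path) == wanted {
            matches.push(path.to_string());
        }
    }
    matches
}
```

Querying `a/a/a` against the three paths listed above yields all three, and the results are copies rather than references.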

Reviewed By: DurhamG

Differential Revision: D9092405

fbshipit-source-id: 49eb4511ff3c9e5400a522b37126e112c917d2d7
2018-07-31 13:49:35 -07:00
Jun Wu
d662a1e82e configparser: implement section remapping
Summary:
The feature is required by Mercurial config layer. It's used by hgweb and
some templater configs.

Reviewed By: DurhamG

Differential Revision: D8886246

fbshipit-source-id: 836fc255b821e6b6c50cf2a435837e9051e90a7d
2018-07-27 18:49:49 -07:00
Jun Wu
fd7791958a configparser: implement section whitelist option
Summary:
The Mercurial API allows setting a section whitelist when parsing configs.
Let's add such feature to the Rust config parser.

Reviewed By: StanislavGlebik

Differential Revision: D8886247

fbshipit-source-id: 981026b98962e065b536077012d7d1042d2ada91
2018-07-27 18:49:49 -07:00
Jun Wu
4f5c7ccc14 configparser: allow defining filter functions to rename section or discard configs
Summary:
There are some advanced config related requirements in Mercurial:
- Drop certain configs if certain HGPLAIN features are set.
- But, do not drop HGPLAIN configs if the config is set via CLI flags.
- Remap section names.
- Whitelist sections.

This diff adds a filter function option aiming to support all of the above.
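
One way such a filter option could be shaped is sketched below. This is an assumption for illustration, not the configparser API: `Filter`, `whitelist`, `remap` and `ConfigSketch` are hypothetical names, and values are plain `String`s.

```rust
use std::collections::HashMap;

/// A filter sees the section and name of a config item. It returns the
/// section to store the item under, or `None` to discard the item.
type Filter = Box<dyn Fn(&str, &str) -> Option<String>>;

/// Keep only items whose section is in `allowed`.
fn whitelist(allowed: &[&str]) -> Filter {
    let allowed: Vec<String> = allowed.iter().map(|s| s.to_string()).collect();
    Box::new(move |section: &str, _name: &str| {
        if allowed.iter().any(|a| a.as_str() == section) {
            Some(section.to_string())
        } else {
            None
        }
    })
}

/// Store items from section `from` under section `to`.
fn remap(from: &str, to: &str) -> Filter {
    let (from, to) = (from.to_string(), to.to_string());
    Box::new(move |section: &str, _name: &str| {
        Some(if section == from.as_str() { to.clone() } else { section.to_string() })
    })
}

/// Hypothetical config set that runs every filter before storing a value.
struct ConfigSketch {
    filters: Vec<Filter>,
    values: HashMap<(String, String), String>,
}

impl ConfigSketch {
    fn new(filters: Vec<Filter>) -> Self {
        ConfigSketch { filters, values: HashMap::new() }
    }

    fn set(&mut self, section: &str, name: &str, value: &str) {
        let mut section = section.to_string();
        for filter in &self.filters {
            match filter(&section, name) {
                Some(renamed) => section = renamed,
                None => return, // a filter discarded this item
            }
        }
        self.values.insert((section, name.to_string()), value.to_string());
    }

    fn get(&self, section: &str, name: &str) -> Option<&str> {
        self.values
            .get(&(section.to_string(), name.to_string()))
            .map(|v| v.as_str())
    }
}
```

Because filters run at load time, a discarded item never enters the config set, so nothing has to be rewritten afterwards.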

Reviewed By: StanislavGlebik

Differential Revision: D8895787

fbshipit-source-id: 1abd90974c4e4b3f7f2fb33173ad2af34e0a4a65
2018-07-27 18:49:49 -07:00
Jun Wu
47094efdde configparser: move "source" to a dedicated "Options" struct
Summary:
It turns out that "source" is not the only "option" that the caller needs to
set. From Mercurial's existing code, namely `ui.readconfig`, the API also
needs to support whitelisting config sections, and "remap" config sections.

Instead of adding more parameters to almost all functions, let's add an
`Options` struct that will hold those configs. For now, it only has
`source`. New fields will be added by upcoming changes.

To help existing code migrate smoothly, and satisfy the most common
use-cases where only "source" is set, a `From<impl Into<Bytes>>` trait is
implemented.

Reviewed By: StanislavGlebik

Differential Revision: D8886244

fbshipit-source-id: 90b49565de6fbbce3e8e48db8e6805154d156360
2018-07-27 18:49:49 -07:00
Durham Goode
2b272f7bbc revisionstore: use ok_or_else instead of ok_or
Summary:
When reading entries, we were using ok_or to read a slice and catch
errors, but this causes an unnecessary allocation for the error even if we don't
have an error. Let's use ok_or_else to avoid that.
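
The difference can be sketched as follows. `read_slice` is a hypothetical helper, not the revisionstore code: `ok_or(format!(...))` would build the error string before checking the `Option`, whereas the closure passed to `ok_or_else` runs only when the value is `None`.

```rust
/// Hypothetical helper: borrow `len` bytes starting at `start`, or return
/// an error message describing the failed read.
fn read_slice(buf: &[u8], start: usize, len: usize) -> Result<&[u8], String> {
    let end = start
        .checked_add(len)
        .ok_or_else(|| "offset overflow".to_string())?;
    buf.get(start..end)
        // The closure runs only when `get` returns `None`, so the
        // `format!` allocation is skipped on the success path.
        .ok_or_else(|| format!("cannot read {} bytes at offset {}", len, start))
}
```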

Reviewed By: quark-zju

Differential Revision: D8897109

fbshipit-source-id: d308f64d54a58077d9ec2eb34dd1bef431ac1819
2018-07-26 12:17:20 -07:00
Durham Goode
6dc36e06d2 datapack: implement Repackable
Summary: Implements the Repackable interface.

Reviewed By: quark-zju

Differential Revision: D8895276

fbshipit-source-id: ba0c83894db283c5c1dddf68ec8fdbe64a17a801
2018-07-26 12:17:19 -07:00
Durham Goode
dfc30ad8e6 revisionstore: add Repackable trait
Summary:
Adds a trait that represents a store that is repackable. An implementor
only needs to be iterable, and expose some basic type and identifier information
and the trait provides the actual repack logic.

Reviewed By: quark-zju

Differential Revision: D8894756

fbshipit-source-id: 13053f8c7b6dca8b80ea819ef18949f3862cf367
2018-07-26 12:17:19 -07:00
Durham Goode
c18a1d04f1 revisionstore: implement iter for DataStore trait
Summary:
We need the ability to iterate over a datastore so we can implement
repack and cleanup. In a later diff we'll use this trait to implement repack
functionality in a way that it can apply to any store that implements
IterableStore.

Reviewed By: quark-zju

Differential Revision: D8885094

fbshipit-source-id: 0a2b1ab8cf524392d890302c33e386f1cd218d24
2018-07-26 12:17:19 -07:00
Durham Goode
e1b153825b revisionstore: store paths on DataPack object
Summary:
The paths for each data pack are used in various situations (repack,
error reporting, etc) so let's store them and make them accessible via the
python api.

Reviewed By: quark-zju

Differential Revision: D8884773

fbshipit-source-id: 4108c98b4e303ba9bded1f264746fa4a84845c73
2018-07-26 12:17:17 -07:00
Jeremy Fitzhardinge
03640e680e tp2: update rust-crates-io
Summary:
Fix crate names for where the crate name doesn't match the package
name. This affected a few crates, but in practice only rust-crypto/crypto was
used.

Reviewed By: Imxset21

Differential Revision: D9002131

fbshipit-source-id: d9591e4b6da9a00029054785b319a6584958f043
2018-07-25 15:50:52 -07:00
Durham Goode
e64f1c7c7f datapack: return KeyError for missing key
Summary:
If the DataIndex didn't have a key, we were returning a DataIndexError
when we should've been returning a KeyError. This tells the higher level stores
to continue to the next store instead of raising the exception further.

Reviewed By: quark-zju

Differential Revision: D8806186

fbshipit-source-id: c40da96101494d5e3ea7910bf4b1a89674463a77
2018-07-25 11:07:33 -07:00
Durham Goode
1f93a51285 datapack: change version to be an enum
Summary:
The version only has a few valid values, so let's change it to be an
enum. This will be used in an upcoming diff to make the python tests pass.

Reviewed By: quark-zju

Differential Revision: D8775752

fbshipit-source-id: b1101c123b4802fbcb0f0a6fe5a45d741aec764f
2018-07-25 11:07:32 -07:00
Durham Goode
1f260293cc datapack: fix indicator for end of chain vs missing delta
Summary:
In the python code, end of a delta chain is marked with one value while
a missing delta is marked with another value. This isn't actually used anywhere,
but let's make the rust code mimic this for now.

Reviewed By: quark-zju

Differential Revision: D8775039

fbshipit-source-id: c9f81471bfd67e720938d6c5bbd10db029406686
2018-07-25 11:07:32 -07:00
Durham Goode
c1baef6ce3 dataindex: hide IndexEntry fields behind methods
Summary:
In a future diff we'll be changing the storage of IndexEntry to be
different from the API. So let's hide the actual format behind functions.

Reviewed By: quark-zju

Differential Revision: D8923005

fbshipit-source-id: 2f87b35315f8a7a5a8e67b6d0be2c73a1d9bccb4
2018-07-25 11:07:32 -07:00
Durham Goode
82a6a73b98 datastore: add get_delta function to DataStore trait
Summary:
This function is present on the python data store api, so let's
replicate it here. Later we should come back and refactor this to be a special
case of the get_delta_chain result, but for now we'll maintain the custom API so
we can start using this code from python.

Reviewed By: quark-zju

Differential Revision: D8774474

fbshipit-source-id: aabcff3a43ae68859a1bf3b23f433214571b1a9d
2018-07-25 11:07:32 -07:00
Durham Goode
047bf26495 buck: add buck target files
Summary: This will let us build with buck.

Reviewed By: quark-zju

Differential Revision: D8980839

fbshipit-source-id: ea64328d32bc2c88984d0c861acefcc55b84ce02
2018-07-24 16:05:26 -07:00
Jun Wu
7e31ecff45 configparser: silence a compiler warning
Summary:
`std::env::home_dir` got deprecated [1]. But the replacements are not in tp2
yet (meaning the buck build will fail). So let's silence the warning for now.

As we're here, also fix an incorrect comment.

[1]: https://internals.rust-lang.org/t/deprecate-or-break-fix-std-env-home-dir/7315

Reviewed By: mitrandir77

Differential Revision: D8886248

fbshipit-source-id: aca0334cbc8b710e42c5c86c952f58adcd10ba2c
2018-07-23 18:37:10 -07:00
Durham Goode
6c71f5a3c0 revisionstore: fix unused code warnings
Summary:
There were a bunch of unused code warnings because the mutabledatapack
module wasn't exposed as public. This then led to us ignoring other warnings.
Let's fix all of them.

Reviewed By: quark-zju

Differential Revision: D8895468

fbshipit-source-id: 914c81026469382fcf28015b4a6bce13bad746c2
2018-07-18 10:08:49 -07:00
Durham Goode
fb0e3537bf dataindex: fix dataindex to store index_start relative locations
Summary:
Previously the rust dataindex would store the delta base location as an
offset relative to the start of the file. The python implementation stores it
relative to the start of the index though. So let's update the rust
implementation.
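
The conversion between the two conventions is simple arithmetic. The sketch below assumes `index_start` is the byte offset in the index file where the index entries begin; both function names are hypothetical.

```rust
/// Convert an absolute file offset into an offset relative to the start
/// of the index entries. Returns `None` if the offset lies before them.
fn to_index_relative(file_offset: u64, index_start: u64) -> Option<u64> {
    file_offset.checked_sub(index_start)
}

/// Convert a stored relative location back into an absolute file offset.
/// Returns `None` on arithmetic overflow.
fn to_file_offset(relative: u64, index_start: u64) -> Option<u64> {
    index_start.checked_add(relative)
}
```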

Reviewed By: quark-zju

Differential Revision: D8774206

fbshipit-source-id: d4317a95df353a7b635f1827fcfad7f3fb171afd
2018-07-17 15:10:01 -07:00
Jun Wu
4089c7bd52 configparser: expose types
Summary: Exposes important types so they can be used in other crates.

Reviewed By: mitrandir77

Differential Revision: D8790923

fbshipit-source-id: 955249219ba5d963d0529ba35f79ed4a8120140a
2018-07-16 19:57:37 -07:00
Jun Wu
e94ffb1907 configparser: implement %unset and %include
Summary: Implement parsing those special macros.

Reviewed By: mitrandir77

Differential Revision: D8779053

fbshipit-source-id: 422cae90497b88b0ad930d3eeacfd94624586f67
2018-07-16 19:57:36 -07:00
Jun Wu
0b39ff42d9 configparser: implement basic parsing
Summary:
Handling sections and normal config items. `%` support will be added in an
upcoming patch.

Note: regex would make the code simpler - the expression
`^([^\s=]+)\s*=\s*(.*(?:\n[\t ].*)*)\s*` can extract both config name and
multi-line values. However a naive benchmark shows it is 20x slower parsing
larger files, and it has some initialization cost. Config parsing is at such
a low level and its performance is critical. So the code does its own
parsing instead of using regex.
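
For a single-line item, hand-written parsing in the spirit of that expression might look like the sketch below. `split_item` is a hypothetical helper that ignores multi-line values; like the regex, it requires a non-empty name with no whitespace before the `=`.

```rust
/// Hypothetical helper: split a single `name = value` line.
/// Returns `None` for lines that do not start with a valid config name,
/// such as section headers or indented continuation lines.
fn split_item(line: &str) -> Option<(&str, &str)> {
    let eq = line.find('=')?;
    let name = line[..eq].trim_end();
    if name.is_empty() || name.contains(char::is_whitespace) {
        return None;
    }
    // '=' is one byte in UTF-8, so `eq + 1` is a valid slice boundary.
    Some((name, line[eq + 1..].trim()))
}
```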

Reviewed By: mitrandir77

Differential Revision: D8779051

fbshipit-source-id: a2de698f0676c886737c47891a0400f187bff822
2018-07-16 19:57:36 -07:00
Jun Wu
245d655673 configparser: implement loading a directory
Summary:
Add functions to load a path, where the path can either be a directory, or a
file. Implement the directory traversal. Loading a file is the most complex
part and will be implemented by an upcoming diff.

Reviewed By: lukaspiatkowski

Differential Revision: D8779052

fbshipit-source-id: f25265b4b7cc5df5cc3717643c3d0ee9cf6da8a4
2018-07-16 19:57:35 -07:00
Jun Wu
b499ec3daa configparser: add string handling utilities
Summary: They will be used by the actual parser.

Reviewed By: lukaspiatkowski

Differential Revision: D8777326

fbshipit-source-id: c6cda3168a060b1d36aaf3224a5e547d0aa45530
2018-07-11 17:36:06 -07:00
Jun Wu
3a93d55e44 configparser: implement set
Summary: This allows setting a config value.

Reviewed By: mitrandir77

Differential Revision: D8779050

fbshipit-source-id: 48544460060bcd383528461275462e63d4884f7f
2018-07-11 17:36:06 -07:00
Jun Wu
dc7ac5545a configparser: define basic interface for the config object
Summary:
Define internal objects and public API. `Bytes` is heavily used for cheaply
copying the values. Simple public APIs are implemented. Complex ones like
the actual parser will be implemented in upcoming changes.

Reviewed By: mitrandir77

Differential Revision: D8777329

fbshipit-source-id: d9de10274d7de6bcdd9af030d238b2b12594f085
2018-07-11 17:36:06 -07:00
Jun Wu
8660f02fcc configparser: define error types
Summary: Define error types to be used in upcoming changes.

Reviewed By: mitrandir77

Differential Revision: D8777328

fbshipit-source-id: 88a171c889798887e4f2436147427837b66573be
2018-07-11 17:36:06 -07:00
Jun Wu
9e08d19d8e configparser: add a new Rust library
Summary: This will be used to parse hgrc-like config files.

Reviewed By: mitrandir77

Differential Revision: D8777330

fbshipit-source-id: 73a114df36e23246a3fc1206be202fba8705453a
2018-07-11 17:36:06 -07:00
Durham Goode
51cca830f8 lz4-pyframe: fix compression of 0 length strings
Summary:
The python lz4 framing logic chooses to include no data when the input
string is 0 length. We need to match that logic in order to be compatible with
it.

See https://github.com/steeve/python-lz4/blob/master/src/python-lz4.c#L75

Reviewed By: quark-zju

Differential Revision: D8773951

fbshipit-source-id: 9bc60fc0779eb923f7c663d7e516b519963e8056
2018-07-09 18:02:58 -07:00
Durham Goode
a169a98521 loosefile: fix compilation errors in tests
Summary:
The Node::random() function changed while this was landing. So we need
to update the tests.

Reviewed By: quark-zju

Differential Revision: D8774074

fbshipit-source-id: 6f3bcdeac069ef5ffdb2deb1970a1655cabcedaf
2018-07-09 15:20:50 -07:00
Jun Wu
9e8f7613fb indexedlog: detect index corruption
Summary:
The primary log and indexes can get out of sync when mutating the indexes
errors out. In that case, mark the indexes as "corrupted" and refuse to
perform index read (lookup) operations, for correctness.

Reviewed By: DurhamG

Differential Revision: D8337689

fbshipit-source-id: 3db9006ea03cfcaba52391f189aa697944b616e5
2018-07-09 14:37:27 -07:00
Jun Wu
9714887f14 indexedlog: add a test about swapping indexes
Summary:
This demonstrates the index definitions can have different orders, as long
as their names do not change, things still work.

Reviewed By: DurhamG

Differential Revision: D8337688

fbshipit-source-id: 2fbbdf711d8edc10fc6d3314532390ea712aca6c
2018-07-09 14:37:26 -07:00
Jun Wu
fdcf835ec4 indexedlog: log: add a test about index lookup
Summary: The test tries to cover interesting variants.

Reviewed By: DurhamG

Differential Revision: D8156520

fbshipit-source-id: b739d1dfcecf8bfa5b23671a83c7f314a021007b
2018-07-09 14:37:26 -07:00
Jun Wu
7a5291ee43 indexedlog: log: add LogLookupIter.into_vec
Summary: This is handy to use.

Reviewed By: DurhamG

Differential Revision: D8156517

fbshipit-source-id: 63aa836bf469de2ad55237dea02b9d0ca28fa3ce
2018-07-09 14:37:26 -07:00
Jun Wu
ee638e6de4 indexedlog: log: implement flush
Summary: Completes the interface.

Reviewed By: DurhamG

Differential Revision: D8156511

fbshipit-source-id: 0d4d05aa23c47117da70ec47cf9be3d4fe41df7b
2018-07-09 14:37:26 -07:00
Qingpeng Niu
7e0204ff39 loosefile class to read Mercurial loose file format data.
Summary: Create a simple rust reader for our loose file format.  One of Mercurial’s simplest file formats is the loose file format.  fbsource/fbcode/scm/hg/hgext/remotefilelog/remotefilelog.py:_createfileblob() is the python writing implementation.

Reviewed By: DurhamG

Differential Revision: D8731050

fbshipit-source-id: 80eb2abde2a2e5bb672d7e8ffa8ba58ed62184c1
2018-07-06 12:51:08 -07:00
Durham Goode
20c35ecbf3 revisionstore: use fixed random generator for tests
Summary:
Instead of using random nodes, let's use ones based off a seeded
generator.

Reviewed By: quark-zju

Differential Revision: D8741139

fbshipit-source-id: a90e6f092adac6aef35149ee6c4bf2b47c469602
2018-07-06 11:11:40 -07:00
Durham Goode
0e34d12531 mutabledatapack: implement get_delta_chain
Summary: Implements the get_delta_chain function of the DataStore trait.

Reviewed By: quark-zju

Differential Revision: D8598658

fbshipit-source-id: 708bca63e2da3aae6064ed18076a9a1f1282a756
2018-07-06 11:11:40 -07:00
Durham Goode
32ce9b99ab revisionstore: change delta base to be an Option<>
Summary:
Deltas may not have bases if they are a full text. Let's represent
that as an Option instead of as a magical null id value. This has the nice
effect of moving the decision to serialize a missing delta base down into the
serializer instead of up at the delta chain construction level.
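
A minimal sketch of the idea, with hypothetical types that are much simpler than the real ones: the in-memory `Delta` carries `Option<Node>`, and only the serializer maps a missing base to the null id.

```rust
/// Hypothetical 20-byte node id and the all-zero "null" id.
type Node = [u8; 20];
const NULL_ID: Node = [0u8; 20];

/// Simplified delta: a full text has no base.
struct Delta {
    base: Option<Node>,
    data: Vec<u8>,
}

/// Write the base followed by the data. A missing base becomes the null
/// id here, not where the delta chain is constructed.
fn serialize_delta(delta: &Delta, out: &mut Vec<u8>) {
    let base = delta.base.as_ref().unwrap_or(&NULL_ID);
    out.extend_from_slice(base);
    out.extend_from_slice(&delta.data);
}
```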

Reviewed By: quark-zju

Differential Revision: D8739231

fbshipit-source-id: b58bd40dae45cb85890812db21e7eeff46aa6b4e
2018-07-06 11:11:40 -07:00
Durham Goode
28e570113e lib: remove cbincode from cargo workspace
Summary: This doesn't exist.

Reviewed By: quark-zju

Differential Revision: D8743699

fbshipit-source-id: b12c2beb600b2918bee8ca579dbf96bc8ce5288c
2018-07-05 18:50:43 -07:00
Jun Wu
a487dacc4b codemod: reformat rest of the code
Summary:
Previous code format attempt (D8173629) didn't cover all files because `**/*.py`
was not expanded recursively by bash. That makes certain changes larger than
they should be (ex. D8675439). Now use zsh's `**/*.py` to format them.

Also fix Python syntax so black can run on more files, and all lint issues.

Reviewed By: phillco

Differential Revision: D8696912

fbshipit-source-id: 95f07aa0c5eb1b63947b0f77f534957f4ab65364
2018-07-05 17:52:43 -07:00
Jun Wu
d0c1b6d014 cargo: add a workspace
Summary:
Make `lib` a cargo workspace so building in subprojects would share a
`target` directory and `cargo doc` will build documentation for all
subprojects.

Reviewed By: DurhamG

Differential Revision: D8741175

fbshipit-source-id: 512325bcb23d51e866e764bdc76dddb22c59ef05
2018-07-05 16:06:35 -07:00
Durham Goode
a1b6fa3007 mutabledatapack: implement get_meta
Summary:
Implements the get_meta function of the DataStore trait. This caught a
bug in how we record lengths as well.

Reviewed By: quark-zju

Differential Revision: D8598661

fbshipit-source-id: 566dca1770d6666e4215fa1fd8f33babdede2f90
2018-07-05 14:53:19 -07:00
Durham Goode
b728154963 mutabledatapack: change error to contain String
Summary:
We want to be able to format error strings, so we can't return a static
str anymore.

Reviewed By: quark-zju

Differential Revision: D8598659

fbshipit-source-id: 44d7a73c06416efca51ca4d0f24a0c8911af8582
2018-07-05 14:53:19 -07:00
Durham Goode
587fc95964 mutabledatapack: begin implementing DataStore trait
Summary:
A mutabledatapack also needs to be readable as a normal store. Let's
start implementing the DataStore trait, starting with get_missing

Reviewed By: quark-zju

Differential Revision: D8598657

fbshipit-source-id: 1f8bc89fae2be73fe789bc0ef1cdd922222019a2
2018-07-05 14:53:18 -07:00
Durham Goode
875758fdbe datapack: implement getdeltachain
Summary: Implements the last of the DataStore api, getdeltachain.

Reviewed By: quark-zju

Differential Revision: D8557950

fbshipit-source-id: 7f6530fe2064f0d035414b7920a126c6aab41beb
2018-07-05 14:53:18 -07:00
Durham Goode
24a4751ff0 revisionstore: change Delta.data to Rc
Summary:
In a future diff we'll be returning data read from a pack file out as a
Delta. To avoid copies, we need to be able to return an Rc from DataPack. This
seems like it will be a common pattern, so let's go ahead and make Delta contain
its data as an Rc.

Reviewed By: quark-zju

Differential Revision: D8557949

fbshipit-source-id: 276005360bfa48e9154143dedce579a21129e976
2018-07-05 14:53:18 -07:00
Durham Goode
c6af00dbc9 datapack: implement getmetadata
Summary:
Introduces the DataEntry structure which is able to parse data entries
from pack files. Uses it to implement getmetadata

Reviewed By: quark-zju

Differential Revision: D8556610

fbshipit-source-id: c25427c3c247970a879ad7d409b821f3695b97d9
2018-07-05 14:53:17 -07:00
Durham Goode
79a9dd976f datapack: implement getmissing
Summary: Adds the DataStore trait and implements the getmissing function.

Reviewed By: quark-zju

Differential Revision: D8554391

fbshipit-source-id: 41c107c07de7d6945ca7370e264c6bc0bf154754
2018-07-05 14:53:17 -07:00
Durham Goode
ebc31e8daf datapack: add initial datapack structure
Summary:
This adds the initial struct and opener for a datapack. Future diffs
will add actual functionality and tests.

Reviewed By: quark-zju

Differential Revision: D8553436

fbshipit-source-id: 3b17f995632e859019205f242a4cce389ac77407
2018-07-05 14:53:17 -07:00
Durham Goode
029666cf27 mutabledatapack: write index to disk during serialization
Summary:
Actually write the index to disk when the mutabledatapack is
serializing.

Reviewed By: quark-zju

Differential Revision: D8552276

fbshipit-source-id: 354c7fdc3fe84b91d582f0e8cde8c6ae2494c559
2018-07-05 14:53:17 -07:00
Durham Goode
e68c0ec7e0 mutabledatapack: add logic for reading DataIndex
Summary: This adds the logic for reading a DataIndex from disk.

Reviewed By: quark-zju

Differential Revision: D8552278

fbshipit-source-id: 611ff09c27716b8d8ff7424c1a27287b9fc42b78
2018-07-05 14:53:16 -07:00
Durham Goode
d22f6ce58d mutabledatapack: add logic for writing DataIndex
Summary:
Soon we will be writing the index during pack file serialization, so
let's add the logic for serializing the index.

Reviewed By: quark-zju

Differential Revision: D8552277

fbshipit-source-id: 60829631eb060f62d266c16f6016f34080311f8e
2018-07-05 14:53:16 -07:00
Durham Goode
69a59f53d9 revisionstore: add Node::from_slice
Summary: A simple helper method for producing Nodes from slices in a safe way.
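
One way such a helper could look, as a sketch only: the 20-byte length, the `InvalidLength` error type, and the tuple-struct layout are assumptions for illustration, not the crate's real definitions.

```rust
/// Hypothetical 20-byte node id; the real `Node` type may differ.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct Node(pub [u8; 20]);

/// Hypothetical error carrying the length that was actually supplied.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidLength(pub usize);

impl Node {
    pub const LEN: usize = 20;

    /// Builds a `Node` from a slice, rejecting any slice that is not exactly `LEN` bytes
    /// instead of panicking or reading out of bounds.
    pub fn from_slice(bytes: &[u8]) -> Result<Node, InvalidLength> {
        if bytes.len() != Self::LEN {
            return Err(InvalidLength(bytes.len()));
        }
        let mut buf = [0u8; Self::LEN];
        buf.copy_from_slice(bytes);
        Ok(Node(buf))
    }
}
```

Returning a `Result` keeps the length check in one place, so callers parsing untrusted pack data do not need to validate lengths themselves.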

Reviewed By: quark-zju

Differential Revision: D8547679

fbshipit-source-id: 85ae8fcd7749c662b1459af1d84ccf9695dd5f0b
2018-07-05 14:53:16 -07:00
Durham Goode
1c3767bc11 mutabledatapack: implement data index header serialization
Summary:
We're beginning to implement the DataPack index file logic. Let's start
with header serialization/deserialization.

Reviewed By: quark-zju

Differential Revision: D8319727

fbshipit-source-id: 079aab06ececb1c5159aec2da3243268eea0cb61
2018-07-05 14:53:15 -07:00
Durham Goode
12e0e5bf16 mutabledatapack: build inmemory index as revisions are added
Summary:
Let's build an inmemory hash table of the revisions that were added. A
future diff will serialize this index into a dataidx file.

Reviewed By: quark-zju

Differential Revision: D8309730

fbshipit-source-id: 9efc7f0f34129a63c52309b4d70179f2c10840b3
2018-07-05 14:53:15 -07:00
Durham Goode
763d1e4bef datapack: implement a fanout table
Summary:
Implements a fanout table trait that history pack and data pack will
use. It basically consists of logic to build and read a quick lookup table that
uses the first few bytes of a key to determine the bounding range of a binary
search.
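
The lookup described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the trait from this diff: it fans out on a single leading byte, uses 4-byte keys for brevity, and the function names `build_fanout`, `bounds`, and `lookup` are made up.

```rust
/// Builds a 256-slot fanout table over keys sorted in ascending order.
/// Slot `b` holds the index of the first key whose first byte is >= `b`.
pub fn build_fanout(sorted_keys: &[[u8; 4]]) -> Vec<usize> {
    let mut table = Vec::with_capacity(256);
    let mut idx = 0;
    for b in 0..256usize {
        // Skip every key whose first byte is smaller than `b`.
        while idx < sorted_keys.len() && (sorted_keys[idx][0] as usize) < b {
            idx += 1;
        }
        table.push(idx);
    }
    table
}

/// Returns the half-open index range that can contain `key`.
pub fn bounds(table: &[usize], key: &[u8; 4], len: usize) -> (usize, usize) {
    let b = key[0] as usize;
    let start = table[b];
    // The range ends where the next first-byte bucket begins, or at the end of the keys.
    let end = if b == 255 { len } else { table[b + 1] };
    (start, end)
}

/// Binary-searches only inside the range selected by the fanout table.
pub fn lookup(keys: &[[u8; 4]], table: &[usize], key: &[u8; 4]) -> Option<usize> {
    let (start, end) = bounds(table, key, keys.len());
    keys[start..end].binary_search(key).ok().map(|i| start + i)
}
```

With a well-distributed hash as the key, each bucket holds roughly 1/256 of the entries, so the binary search starts from a much narrower range.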

Reviewed By: quark-zju

Differential Revision: D8309729

fbshipit-source-id: 71e398277dc8ae041447035f044e5d47ca41cf7e
2018-07-05 14:53:15 -07:00
Durham Goode
4588cc18c8 mutabledatapack: write version number header
Summary:
The mutabledatapack format has a one byte header containing the version
number.
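
A small sketch of writing and reading such a one-byte header with the standard `Read`/`Write` traits. The function names are hypothetical, and the version value itself is left to the caller because the summary does not state it.

```rust
use std::io::{self, Read, Write};

/// Writes the one-byte version header at the current position.
pub fn write_version<W: Write>(writer: &mut W, version: u8) -> io::Result<()> {
    writer.write_all(&[version])
}

/// Reads the one-byte version header back; fails if the input is empty.
pub fn read_version<R: Read>(reader: &mut R) -> io::Result<u8> {
    let mut buf = [0u8; 1];
    reader.read_exact(&mut buf)?;
    Ok(buf[0])
}
```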

Reviewed By: quark-zju

Differential Revision: D8305653

fbshipit-source-id: c4a96dc48e64acd2c5849034e5d90b87363fbc8d
2018-07-05 14:53:15 -07:00
Durham Goode
99a11bbb24 mutabledatapack: use hash of contents as name
Summary:
Implements the logic that builds a hash of the contents of the pack
file and uses it as the name.

Reviewed By: quark-zju

Differential Revision: D8305654

fbshipit-source-id: d1270e7519a7718aa5427f3be5cdc0cd0dee2fe2
2018-07-05 14:53:14 -07:00
Durham Goode
3f467bd21f mutabledatapack: implement add()
Summary:
This is the start of a rust mutable datapack implementation. The first
diff adds a simple add function. Later diffs will add the logic that builds the
index, serializes the index, and computes the final hash name.

Reviewed By: quark-zju

Differential Revision: D8304036

fbshipit-source-id: db05c2b845e51a3552c039b7fc0b8f4cc0ff0852
2018-07-05 14:53:14 -07:00
Durham Goode
8057817dc1 revisionstore: add read/write functions to Metadata
Summary:
In a future diff we'll be serializing and deserializing metadata in
datapacks. Let's add the reader and writer functions for Metadata and some unit
tests.

Reviewed By: quark-zju

Differential Revision: D8303603

fbshipit-source-id: 7e7a7aa218c05179b205abf8b151b1488be674b3
2018-07-05 14:53:14 -07:00
Liubov Dmitrieva
e22322c2e5 commit cloud subscriber: skip cloud sync if can't resolve interngraph host
Summary:
this will reduce cloud sync errors and unnecessary cloud sync calls

the daemon triggers cloud sync on service start/restart
this is not always a time when the machine is online (and connected to the correct network), so we get cloud sync errors

Reviewed By: markbt

Differential Revision: D8692972

fbshipit-source-id: 59033fd4c3e7c30100d82b908442bbf1ebea9322
2018-06-29 12:20:11 -07:00
Jun Wu
4ba555977c vendoredcrates: upgrade zstd-sys to the latest
Summary:
zstd has dropped `ZSTD_TARGETLENGTH_MIN` [1]. Let's upgrade our code to be
compatible.

[1]: c2c47e24e0

Reviewed By: DurhamG

Differential Revision: D8683180

fbshipit-source-id: 66cbab1ddd254b1e0b91232565b4d512810ba03d
2018-06-28 15:08:01 -07:00
Saurabh Singh
fa3c7b34a3 add basic tests for unionhistorystore
Summary:
This commit adds very basic tests for the Union History Store. These
tests just test for expected output of operations on bad/empty stores.

Reviewed By: quark-zju

Differential Revision: D8553821

fbshipit-source-id: a0dfa47f10083c37901535e8a810a99693a28c82
2018-06-27 19:05:31 -07:00
Saurabh Singh
40e70758b7 introduce union history store
Summary: This commit just introduces the Union History Store.

Reviewed By: DurhamG

Differential Revision: D8553822

fbshipit-source-id: 6c7ee0b5d33dae6d51b4179616d206f42eb0cd50
2018-06-27 19:05:31 -07:00
Saurabh Singh
0326fe4584 introduce history store
Summary: This commit just introduces the history store.

Reviewed By: DurhamG

Differential Revision: D8553823

fbshipit-source-id: 93af6059296d11c4fcc0dd306b4472c4f2168fa7
2018-06-27 19:05:31 -07:00
Saurabh Singh
164bf3e85a fix error messaging
Summary: This commit just fixes the messaging for the errors.

Reviewed By: DurhamG

Differential Revision: D8553820

fbshipit-source-id: 73f2cd13e7538b6870b16a0e47e657a6d08af9e3
2018-06-26 11:36:09 -07:00
Jun Wu
e17f635422 treestate: fix perf regression on treedirstate
Summary: `calculate_aggregated_state_recursive` should be a no-op with treedirstate.

Reviewed By: DurhamG

Differential Revision: D8505551

fbshipit-source-id: 08b081944cccc0abc4f41ac2e75c8c4305bc9772
2018-06-19 00:48:56 -07:00
Liubov Dmitrieva
91144d493c cloudsyncsubscriber log pid of cloud sync process
Summary: log the pid of the spawned cloud sync process, it might help with debugging if something is broken

Reviewed By: markbt

Differential Revision: D8478566

fbshipit-source-id: fd9a9a228bc325056fb35d17ee93c865679e6e23
2018-06-18 08:05:14 -07:00
Liubov Dmitrieva
053b496956 improve robustness
Summary:
read the token only when it is needed to do so, not in the constructor
scm daemon can run for users who are not registered with Commit Cloud

Reviewed By: markbt

Differential Revision: D8445923

fbshipit-source-id: b0d8c86729721037a02f93bbf7fa1fc88d7d7979
2018-06-15 07:48:22 -07:00
Jun Wu
807e8af1e1 zstdelta: update to rand 0.5
Summary: Update rand to 0.5. Make it build with buck.

Reviewed By: phillco

Differential Revision: D8412349

fbshipit-source-id: 663b9ca7d3c2b08ade756b4cb3f135b3af2a3d20
2018-06-14 21:38:33 -07:00
Jun Wu
119b479c9e indexedlog: log: implement index updating logic
Reviewed By: DurhamG

Differential Revision: D8156519

fbshipit-source-id: eb82e7547d10c7b839e757fa787f91950dea181e
2018-06-11 19:36:16 -07:00
Jun Wu
365c728134 indexedlog: index: add metadata to the root node
Summary:
This allows us to store arbitrary metadata in the root node. It will be used
by the `Log` structure to store how many bytes the index covers.

Reviewed By: DurhamG

Differential Revision: D8337687

fbshipit-source-id: 159a89d66765fc251a486fd62c1ffd01f625b503
2018-06-11 19:36:16 -07:00
Jun Wu
0b92632004 indexedlog: log: implement log loading functions
Summary: Implement the dependencies of the "open" public API.

Reviewed By: DurhamG

Differential Revision: D8156518

fbshipit-source-id: 9fed441f520a3b74cbef5bfb815c82943c615fdf
2018-06-11 19:36:16 -07:00
Jun Wu
77d75acbdd indexedlog: log: implement the iterators
Summary: Implement `LogLookupIter`, and `LogIter` for fetching data.

Reviewed By: DurhamG

Differential Revision: D8156521

fbshipit-source-id: 5ef2b2e6475d41ae7468e79b4a1234619decf75f
2018-06-11 19:36:15 -07:00
Jun Wu
8c3a69a56e indexedlog: log: implement internal read_entry function
Summary:
The read_entry function takes care of reading an entry from a given offset,
and returns internal stats like the real data offset (skipping the length and
checksum metadata) and the next entry offset.

It performs an integrity check and handles offsets for both in-memory and
on-disk buffers. The offsets to in-memory entries are fairly simple - they start
from "meta.primary_len" instead of a fixed reserved value. This makes the
"next_offset" work seamlessly.

The public API won't have "offset" exposed, so the API is private.
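
To make the "data offset" and "next offset" bookkeeping concrete, here is a sketch over an invented layout. Everything specific is an assumption: the `[length][checksum][data]` layout, the 4-byte little-endian fields, and the byte-sum checksum are placeholders, not the format this diff implements.

```rust
/// Hypothetical entry layout, invented for this sketch:
/// [data length: u32 LE][checksum: u32 LE][data bytes]
const HEADER_LEN: usize = 8;

/// Placeholder checksum (wrapping byte sum); a real log would use a proper hash.
fn checksum(data: &[u8]) -> u32 {
    data.iter().fold(0u32, |acc, &b| acc.wrapping_add(b as u32))
}

#[derive(Debug, PartialEq)]
pub struct EntryResult {
    pub data_offset: usize, // where the payload starts, past length and checksum
    pub data_len: usize,
    pub next_offset: usize, // where the following entry would start
}

#[derive(Debug, PartialEq)]
pub enum ReadError {
    OutOfBounds,
    ChecksumMismatch,
}

/// Appends one entry in the hypothetical layout.
pub fn append_entry(buf: &mut Vec<u8>, data: &[u8]) {
    buf.extend_from_slice(&(data.len() as u32).to_le_bytes());
    buf.extend_from_slice(&checksum(data).to_le_bytes());
    buf.extend_from_slice(data);
}

/// Reads the entry at `offset`, verifying bounds and the checksum.
pub fn read_entry(buf: &[u8], offset: usize) -> Result<EntryResult, ReadError> {
    let header = buf.get(offset..offset + HEADER_LEN).ok_or(ReadError::OutOfBounds)?;
    let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    let expected = u32::from_le_bytes([header[4], header[5], header[6], header[7]]);
    let data_offset = offset + HEADER_LEN;
    let data = buf.get(data_offset..data_offset + len).ok_or(ReadError::OutOfBounds)?;
    if checksum(data) != expected {
        return Err(ReadError::ChecksumMismatch);
    }
    Ok(EntryResult { data_offset, data_len: len, next_offset: data_offset + len })
}
```

Feeding each result's `next_offset` back into `read_entry` walks the buffer entry by entry, which is the property the summary relies on.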

Reviewed By: DurhamG

Differential Revision: D8156513

fbshipit-source-id: 8661f2f2757de6f3f94defc64f4a8dd5261973b2
2018-06-11 19:36:15 -07:00
Jun Wu
991a9343b9 indexedlog: log: partially implement main APIs
Summary:
Partially implement the open, append, flush, and lookup APIs. This shows how
things work in general, such as how locking works and what is in-memory versus
on-disk.

Reviewed By: DurhamG

Differential Revision: D8156514

fbshipit-source-id: 2de23dcde2f63895f3f3e4f67057aa9520fdfa34
2018-06-11 19:36:15 -07:00
Jun Wu
529c79bd33 indexedlog: log: implement serialization for the meta file
Summary: Implemented according to the file format specification added by the previous diff.

Reviewed By: DurhamG

Differential Revision: D8156516

fbshipit-source-id: 7153932b9442b3ab5bdb81490f88c40346128afc
2018-06-11 19:36:15 -07:00
Jun Wu
97281caabf indexedlog: log: define public facing interface
Summary: The public interface and its dependencies.

Reviewed By: DurhamG

Differential Revision: D8156509

fbshipit-source-id: c6f3e4b88851683a5d8804b80f689282e3f582d4
2018-06-11 19:36:15 -07:00
Jun Wu
8ad9276975 indexedlog: log: add comments about the file format
Summary: Start implementing the "Log" object. Let's define the file formats first.

Reviewed By: DurhamG

Differential Revision: D8156515

fbshipit-source-id: 037f7454452959f82583a4d97d3f38dfa60aa741
2018-06-11 19:36:14 -07:00
Jun Wu
d7c4d3a249 treestate: optimize calculate_aggregated_state_recursive
Summary:
Follow-up of the previous diff. Change the file format so aggregated_state
could be loaded without loading all entries. This would make
`calculate_aggregated_state_recursive` (and `write_delta`) more efficient
in case the node is not modified.

Reviewed By: markbt

Differential Revision: D7909169

fbshipit-source-id: d70b662c7d8c544edf81fbc7da94da9ccbee6cf0
2018-06-11 14:32:42 -07:00
Jun Wu
4819f4203d treestate: calculate aggregated_state recursively during write_delta
Summary:
This avoids a possible Rust panic during `write_delta`, because entries
could have `id` set without `aggregated_state`, by `Node::open`. This diff
fixes that by calling `calculate_aggregated_state_recursive`. The function
has to be changed to static dispatch, since dynamic dispatch only supports
one trait. Practically, this would load one-level content unnecessarily,
which might be optimized by separating loading entries vs loading aggregated
state.

Reviewed By: markbt

Differential Revision: D7909168

fbshipit-source-id: 5effe9df59ce42829a077cab89525103e211bddf
2018-06-11 14:32:42 -07:00
Jun Wu
1f83a4dc00 treestate: require visitor to provide whether it modifies a file or not
Summary:
This is subtle. If visitor changes file state, `Node.id` should be set to
`None` to mark it as "changed".

In practice, treedirstate uses visitor to rewrite mtime to -1 if mtime is
"fsnow". Those rewritten mtimes all belong to "changed" nodes (because "fsnow"
can only increase, and on-disk entries cannot have "mtime == fsnow" because
they would be written to -1 during the previous write), so it's not a problem
yet.

It is safer to not depend on the fact that "visitor" can only change "changed"
nodes. On the other hand, detecting changes for all filestate fields could be
undesirably expensive. So let's make the visitor provide the "changed or not"
information. Surely the visitor knows what it does.
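
A rough sketch of the contract described above, using simplified hypothetical types (a flat `Node` with a `files` vector, a one-field `FileState`, and a `VisitorResult` enum); the real tree structure and signatures are not shown in this log.

```rust
/// Hypothetical, simplified file state; the real struct has more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct FileState {
    pub mtime: i32,
}

/// What the visitor reports back: did it modify the state it was given?
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum VisitorResult {
    NotChanged,
    Changed,
}

/// Hypothetical flat node: `id` is `Some` while the in-memory copy matches what is on disk.
pub struct Node {
    pub id: Option<u64>,
    pub files: Vec<FileState>,
}

impl Node {
    /// Runs the visitor over every file and clears `id` as soon as any change is reported,
    /// so the node is treated as "changed" without comparing every field.
    pub fn visit<F>(&mut self, visitor: &mut F)
    where
        F: FnMut(&mut FileState) -> VisitorResult,
    {
        for state in self.files.iter_mut() {
            if visitor(state) == VisitorResult::Changed {
                self.id = None;
            }
        }
    }
}
```

A visitor that rewrites `mtime == fsnow` to -1 would return `Changed` only for the entries it actually rewrote, leaving untouched nodes with their `id` intact.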

Reviewed By: markbt

Differential Revision: D7909167

fbshipit-source-id: 21e71302cf1db86c1330b294baddd51cc8a96026
2018-06-11 14:32:42 -07:00
Jun Wu
98db645ecd treestate: drop Tree.get_mut API
Summary: It's not used, therefore removed.

Reviewed By: markbt

Differential Revision: D7909171

fbshipit-source-id: 587a1d844ece4f2cb0c2ccd9b2d978aed69a959f
2018-06-11 14:32:42 -07:00
Liubov Dmitrieva
a4d1fac35a commitcloud add '--use-bgssh' option for hg cloud sync
Summary:
this is needed because `hg cloud sync` can be triggered by external services like scm_daemon on behalf of the user,
so it should just fail rather than expect the user to type a password. We therefore change the ui ssh option to the bgssh (background ssh) command that is defined in the infinitepush section

Reviewed By: markbt

Differential Revision: D8331723

fbshipit-source-id: 28f9d007702e4f6ed5216114921375b76def3f93
2018-06-08 10:32:34 -07:00
Jun Wu
c5b267584b treestate: migrate to rand 0.5 to fix cargo test without breaking buck
Summary:
The pull request [1] is still open, which means `quickcheck::rand` is still
private when building with `cargo`. It only works with a patched quickcheck.
We cannot revert D8234503 since that will break buck build. So there is no
choice but upgrade to rand 0.5.

[1]: https://github.com/BurntSushi/quickcheck/pull/204

Reviewed By: DurhamG

Differential Revision: D8297404

fbshipit-source-id: 19937c49ae96a39e326b1b54eb00e6e2944193c2
2018-06-06 12:54:37 -07:00
Phil Cohen
b54cbaa464 commitcloudsubscriber: use old import syntax
Summary: The Ubuntu and Windows builders have an older rustc that doesn't support this syntax.

Reviewed By: DurhamG

Differential Revision: D8301570

fbshipit-source-id: 56990a804053a4dc78e41789c7b577bcf82868d7
2018-06-06 12:06:18 -07:00
Wez Furlong
31bcfbe58e hg: disable check-code tests for C code
Summary:
They're actively fighting against the clang-format config
and don't have an auto-fix.

Reviewed By: quark-zju

Differential Revision: D8283622

fbshipit-source-id: 2de45f50e6370a5ed14915c6ff23dc843ff14e8a
2018-06-05 19:21:43 -07:00
Durham Goode
d34a99a394 commitcloud: avoid using nested includes
Summary:
The windows and ubuntu builds don't have a version of rust that
supports these features, so this breaks the build.

Reviewed By: phillco, quark-zju, singhsrb

Differential Revision: D8289651

fbshipit-source-id: d08b141b4d9996e3b899ac0604225ad34f863990
2018-06-05 16:06:56 -07:00
Liubov Dmitrieva
c80a2aafcb scm daemon: refactoring (remove unused crates)
Summary: just refactoring to improve the code quality

Reviewed By: markbt

Differential Revision: D8276584

fbshipit-source-id: bf0317e91f96d2f7fee24ea69c0f33a0aed54a98
2018-06-05 07:11:55 -07:00
Liubov Dmitrieva
ae7ece9cd5 scm daemon: refactoring
Summary:
just refactoring to improve the code quality

the main improvement is that I separated TcpReceiver to a different service,
any other services can register callbacks with TcpReceiver service.

For WorkspaceSubscriberService callbacks are implemented using mpsc channel to notify the main WorkspaceSubscriberService thread and single atomic flag that allows running subscriptions to join.

Another improvement is that I added logic to run cloud sync on the first keep alive after connection errors
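
The channel-plus-flag arrangement mentioned above might look something like the following. This is a loose sketch with made-up names (`Command`, `spawn_subscription`, `make_callback`, `run_main_loop`); the daemon's real services and message types are not shown in this log.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical commands a receiver callback can forward to the main thread.
#[derive(Debug, PartialEq)]
pub enum Command {
    Restart,
    Stop,
}

/// A running subscription polls until the shared flag is raised, then exits so it can be joined.
pub fn spawn_subscription(interrupt: Arc<AtomicBool>) -> thread::JoinHandle<u32> {
    thread::spawn(move || {
        let mut polls = 0;
        while !interrupt.load(Ordering::Relaxed) {
            polls += 1;
            thread::sleep(Duration::from_millis(5));
        }
        polls
    })
}

/// Builds the callback another service would register: it only forwards the command.
pub fn make_callback(tx: Sender<Command>) -> impl Fn(Command) {
    move |command| {
        // Ignore send errors: they only mean the main thread has already gone away.
        let _ = tx.send(command);
    }
}

/// Main thread: wait for one command, raise the flag, and join every subscription.
/// Returns how many subscriptions finished cleanly.
pub fn run_main_loop(
    rx: Receiver<Command>,
    interrupt: Arc<AtomicBool>,
    workers: Vec<thread::JoinHandle<u32>>,
) -> usize {
    let _ = rx.recv();
    interrupt.store(true, Ordering::Relaxed);
    workers.into_iter().map(|w| w.join()).filter(|r| r.is_ok()).count()
}
```

A single flag shared by all subscription threads keeps shutdown simple: one store wakes every poll loop, and the main thread can then join them all.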

Reviewed By: markbt

Differential Revision: D8226109

fbshipit-source-id: 3fe513da9273b28b2262948ecdf620821e7ab313
2018-06-05 07:11:55 -07:00
Liubov Dmitrieva
80f63e9451 scm daemon: refactoring (improve messages correctness)
Summary: just refactoring to improve the code quality

Reviewed By: markbt

Differential Revision: D8276563

fbshipit-source-id: afca70b9b487450fbaab897dff5cd79d6c3a0108
2018-06-05 04:36:07 -07:00
Jun Wu
c65612acc9 indexedlog: index: stop iteration if an error is encountered
Summary:
Without this change, code doing `index.get(...).values().collect()` might
end up with an infinite loop.
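
The failure mode can be illustrated with a generic adapter. This is a sketch, not the index iterator from this diff; `StopOnError` is a hypothetical name. An iterator of `Result`s that keeps yielding `Err` forever never lets `collect()` into a `Vec` finish, so the adapter ends iteration after the first error.

```rust
/// Wraps an iterator of `Result`s and stops after yielding the first `Err`.
pub struct StopOnError<I> {
    inner: I,
    errored: bool,
}

impl<I> StopOnError<I> {
    pub fn new(inner: I) -> Self {
        StopOnError { inner, errored: false }
    }
}

impl<I, T, E> Iterator for StopOnError<I>
where
    I: Iterator<Item = Result<T, E>>,
{
    type Item = Result<T, E>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.errored {
            // An error was already reported once; end the iteration.
            return None;
        }
        let item = self.inner.next();
        if let Some(Err(_)) = item {
            self.errored = true;
        }
        item
    }
}
```

The error is still surfaced exactly once, so callers can see what went wrong while being guaranteed that iteration terminates.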

Reviewed By: DurhamG

Differential Revision: D8156510

fbshipit-source-id: 5497aa354de7d49cfc4308a025856608ce981a1e
2018-06-05 00:12:29 -07:00
Jun Wu
798e55d53d indexedlog: index: change APIs to take file lengths instead of root offsets
Summary:
Previously, the index API optionally took a root offset. This is
inconvenient for the caller, since they probably need to record both the
valid file length and the root offset. Since root nodes are always at
the end of the index, let's just simplify the API to take a logical
file length instead of a root offset.

Reviewed By: DurhamG

Differential Revision: D8156512

fbshipit-source-id: 7029272a61c9990e6484bca7ebbff64e2233c6cd
2018-06-05 00:12:29 -07:00
Jun Wu
68660cc443 indexedlog: utils: make mmap_readonly optionally take file length
Summary:
Previously, `mmap_readonly` always reads file length, and uses that for mmap
length. In many cases we do know the desired file length and it's cleaner to
not `mmap` unused bytes. So let's add a parameter to do that.

Note: The `stat` call is still needed, since `mmap` wouldn't return an error
if the requested length is greater than the file length.
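
The length check implied by the note can be sketched with the standard library alone. The helper name `mmap_len` and the choice of error kind are assumptions; the actual mapping call is omitted because it depends on an external crate.

```rust
use std::fs::File;
use std::io;

/// Decides how many bytes to map: the caller's length if given, otherwise the whole file.
/// Fails up front if the caller asks for more bytes than the file currently holds,
/// because the mapping call itself would not report that.
pub fn mmap_len(file: &File, requested: Option<u64>) -> io::Result<u64> {
    // This is the `stat` step: it is needed even when a length is supplied.
    let actual = file.metadata()?.len();
    match requested {
        None => Ok(actual),
        Some(len) if len <= actual => Ok(len),
        Some(len) => Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            format!("requested {} bytes but file has only {}", len, actual),
        )),
    }
}
```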

Reviewed By: DurhamG

Differential Revision: D8156523

fbshipit-source-id: 991aa28f3542eaff24387dcc6a7302122fb6962f
2018-06-05 00:12:29 -07:00
Jun Wu
c43312ad9c indexedlog: utils: move xxhash to utils
Summary: The function will be reused in another module.

Reviewed By: DurhamG

Differential Revision: D8156522

fbshipit-source-id: 2aff6f2e4b8fc9b5d2c000e12ac2d940f7fab407
2018-06-05 00:12:29 -07:00
Saurabh Singh
7c9227818a refactor rust datastore to a consistent naming scheme
Summary: This is just a refactor to address the naming scheme.

Reviewed By: quark-zju

Differential Revision: D8269217

fbshipit-source-id: 8c52d2c67837550e0b7dc1a45b3faf9a80319b61
2018-06-04 17:39:47 -07:00
Saurabh Singh
7067e4ca1f fix nit in implementation
Summary:
Based on review for D8214151 by quark-zju, addressing the nit here as
well.

Reviewed By: quark-zju

Differential Revision: D8267140

fbshipit-source-id: 12c3355852a49859c2b0a243fa8666105c914c73
2018-06-04 16:21:38 -07:00
Saurabh Singh
8ba5a79489 adding tests for bad data store
Summary:
Adding the tests for the case when the union store has only one data
store which always returns an `Err` as `Result`. This `Err` is not of the type
`KeyError` which the union store handles differently.
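
For context, one plausible shape of that distinction, written as a sketch: the assumption here is that a `KeyError` makes the union store try the next store while any other error is returned immediately. The `StoreError` enum, the string keys, and the `EmptyStore`/`BadStore` helpers are all hypothetical.

```rust
/// Hypothetical error type separating "key not found" from every other failure.
#[derive(Debug, PartialEq)]
pub enum StoreError {
    KeyError(String),
    Other(String),
}

pub trait DataStore {
    fn get(&self, key: &str) -> Result<Vec<u8>, StoreError>;
}

pub struct UnionStore {
    stores: Vec<Box<dyn DataStore>>,
}

impl UnionStore {
    pub fn new() -> Self {
        UnionStore { stores: Vec::new() }
    }

    pub fn add(&mut self, store: Box<dyn DataStore>) {
        self.stores.push(store);
    }

    /// Assumed behavior: skip stores that report a missing key, stop on any other error.
    pub fn get(&self, key: &str) -> Result<Vec<u8>, StoreError> {
        for store in &self.stores {
            match store.get(key) {
                Ok(data) => return Ok(data),
                Err(StoreError::KeyError(_)) => continue,
                Err(other) => return Err(other),
            }
        }
        Err(StoreError::KeyError(key.to_string()))
    }
}

/// Test double: contains nothing, so every lookup is a `KeyError`.
pub struct EmptyStore;
impl DataStore for EmptyStore {
    fn get(&self, key: &str) -> Result<Vec<u8>, StoreError> {
        Err(StoreError::KeyError(key.to_string()))
    }
}

/// Test double: always fails with a non-`KeyError` error.
pub struct BadStore;
impl DataStore for BadStore {
    fn get(&self, _key: &str) -> Result<Vec<u8>, StoreError> {
        Err(StoreError::Other("bad store".to_string()))
    }
}
```

Under this assumption, a union of only `EmptyStore` reports a `KeyError`, while a union of only `BadStore` surfaces the store's own error unchanged.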

Reviewed By: quark-zju

Differential Revision: D8214156

fbshipit-source-id: bd077af343086c92f46ec6a6f1551d05dd9bda09
2018-06-04 16:21:38 -07:00
Saurabh Singh
242f2b904f add tests for empty data store
Summary:
Adding tests for the case when the union store only has a single data
store which is completely empty.

Reviewed By: quark-zju

Differential Revision: D8214151

fbshipit-source-id: 9d8f329548a1b7e105a5dc6219067a6e292fe97c
2018-06-04 16:21:37 -07:00