Commit Graph

691 Commits

Author SHA1 Message Date
Xavier Deguillard
dc589c38e5 revisionstore: don't import unused format_err
Summary: The compiler is warning about it.

Reviewed By: singhsrb

Differential Revision: D21550266

fbshipit-source-id: 4e66b0dda0e443ed63aeccd888d38a8fcb5e4066
2020-05-13 10:44:07 -07:00
Jun Wu
4f39a8e5a6 mutationstore: add a method that returns a dag
Summary:
Part of the mutation graph (excluding split and fold) can fit in the DAG
abstraction. Add a method to do that. This allows cross-dag calculations
like:

  changelogdag = ... # suppose available by segmented changelog

  # mutdag and changelogdag are independent (might have different nodes),
  # with full DAG operations on either of them.
  mutdag = mutation.getdag(...)
  mutdag.heads(mutdag.descendants([node])) & changelogdag.descendants([node2]) # now possible

Comparing to the current situation, this has some advantages:
- No need to couple the "visibility", "filtered node" logic to the mutation
  layer. The unknown nodes can be filtered out naturally by a set "&"
  operation.
- DAG operations like heads, roots can be performed on mutdag when it's
  previously impossible. We also get operations like visualization for free.

There are some limitations, though:
- The DAG cannot represent non 1:1 modifications (fold, split) losslessly.
  Those relationships are simply ignored for now.
- The MemNameDag is not lazy. Reading a long chain of amends might be slow.
  For most normal use-cases it is probably okay. If it becomes an issue we
  can seek for other solutions, for example, store part of mutationstore
  directly in a DAG format on disk, or have fast paths to bypass long
  predecessor chain calculation.

Reviewed By: DurhamG

Differential Revision: D21486521

fbshipit-source-id: 03624c8e9803eb1852b3034b8f245555ec582e85
2020-05-13 09:45:24 -07:00
Arun Kulshreshtha
647a91647b edenapi: add history support to data_util
Summary: Add the ability to parse EdenAPI history responses to `data_util`.

Reviewed By: quark-zju

Differential Revision: D21489228

fbshipit-source-id: 42dda64273673431a6f3e4d7bd430689c76c387f
2020-05-12 16:26:22 -07:00
Arun Kulshreshtha
40928f027c make_req: take array instead of object as input for data requests
Summary: Change `make_req` to take a JSON array as input when constructing `DataRequest`s instead of a JSON object. This is more correct because DataRequests can include multiple `Key`s with the same path; this cannot be represented as an object since an object is effectively a hash map wherein we would have duplicate keys.

Reviewed By: quark-zju

Differential Revision: D21412989

fbshipit-source-id: 07a092a15372d86f3198bea2aa07b973b1a8449d
2020-05-12 16:26:20 -07:00
Ellis Hoag
1d0d626a36 Pass config object down to repack
Summary:
Pass `configparser::config::ConfigSet` to `repack` in
`revisionstore/src/repack.rs` so that we can use various config values in `filter_incrementalpacks`.

* `repack.maxdatapacksize`, `repack.maxhistpacksize`
  * The overall max pack size
* `repack.sizelimit`
  * The size limit for any individual pack
* `repack.maxpacks`
  * The maximum number of packs we want to have after repack (overrides sizelimit)

Reviewed By: xavierd

Differential Revision: D21484836

fbshipit-source-id: 0407d50dfd69f23694fb736e729819b7285f480f
2020-05-11 16:41:30 -07:00
Xavier Deguillard
1cd0bba3fa revisionstore: enable use of proxies for LFS
Summary:
If http_proxy.no is set, we should respect it to avoid sending traffic to it
whenever required.

Reviewed By: wez

Differential Revision: D21383138

fbshipit-source-id: 4c8286aaaf51cbe19402bcf8e4ed03e0d167228b
2020-05-11 10:36:11 -07:00
Xavier Deguillard
2001c3fd69 revisionstore: add translate_lfs_missing to remote store get
Summary:
When Qing implemented all the get method, the translate_lfs_missing function
didn't exist, and I forgot to add them in the right places when landing the
diff that added it. Fix this.

Reviewed By: sfilipco

Differential Revision: D21418043

fbshipit-source-id: baf67b0fe60ed20aeb2c1acd50a209d04dc91c5e
2020-05-11 10:34:01 -07:00
Jun Wu
85a60dd9e4 renderdag: provide a method to render MemNameDag directly to a string
Summary: This would be handy to visualize a MemNameDag.

Reviewed By: sfilipco

Differential Revision: D21486522

fbshipit-source-id: c8d7147dc53a1a7c1b8b09ce055493c69cceba2f
2020-05-11 09:50:00 -07:00
Jun Wu
4352be72d3 renderdag: use MemNameDag to simplify tests
Summary:
Use MemNameDag::from_ascii to simplify the tests. This removes the need of:
- using tempdir
- converting between Id and VertexName manually via an IdMap
- depending on drawdag directly

Reviewed By: sfilipco

Differential Revision: D21486519

fbshipit-source-id: f04061d8892f043de40e7e321273acc51e15308a
2020-05-11 09:50:00 -07:00
Jun Wu
60684eb2c5 dag: make ASCII -> MemNameDag a public API
Summary:
It seems handy to construct a Dag just from ASCII. Therefore move it to a
public interface.

Reviewed By: sfilipco

Differential Revision: D21486525

fbshipit-source-id: de7f4b8dfcbcc486798928d4334c655431373276
2020-05-11 09:49:59 -07:00
Jun Wu
a6b7e965f3 dag: remove a TODO comment
Summary: It was done as NameSet.

Reviewed By: sfilipco

Differential Revision: D21479022

fbshipit-source-id: 1c32cabb27d72a6438409ede226104a9ebac6a1d
2020-05-11 09:49:59 -07:00
Jun Wu
4eb9251172 dag: move sort and parent_names to NameDagAlgorithm
Summary:
They are part of the read-only algorithms that are not specific to a certain
type of NameDag.

Reviewed By: sfilipco

Differential Revision: D21479017

fbshipit-source-id: 3fa58071ac43246d3cd45d84384ee93c7385f414
2020-05-11 09:49:59 -07:00
Jun Wu
282e034d30 dag: add MemNameDag
Summary:
Adds an in-memory NameDag so we can construct the DAG and use its algorithms by
just providing parents function and heads.

Reviewed By: sfilipco

Differential Revision: D21479021

fbshipit-source-id: e12d53a97afec77b2307d5efbb280bd506dee0ba
2020-05-11 09:49:58 -07:00
Jun Wu
5cbb99f4eb dag: add MemIdMap
Summary: Adds an in-memory IdMap to be used in an in-memory NameDag.

Reviewed By: sfilipco

Differential Revision: D21479018

fbshipit-source-id: bc702762b059e8659c6ab322f3c39f032e95d5b6
2020-05-11 09:49:58 -07:00
Jun Wu
682e8e96a7 dag: use IdMap traits in NameDag and NameSet
Summary:
This allows them to switch to a different IdMap implementation relatively
easily.

Reviewed By: sfilipco

Differential Revision: D21479023

fbshipit-source-id: 8ecb99cafe2093ec7d14b848ffa08581c5300414
2020-05-11 09:49:57 -07:00
Jun Wu
759f8b35c5 dag: move some IdMap operations to traits
Summary: This will allow different IdMap implementations.

Reviewed By: sfilipco

Differential Revision: D21479016

fbshipit-source-id: 852501896fddcb82624338acd9dceee41150e302
2020-05-11 09:49:57 -07:00
Jun Wu
30163eeb58 dag: update snapshot_map on change
Summary:
`NameDag::add_heads` API changes the internal `dag` state without updating
`snapshot_map`. That will cause queries relying on `snapshot_map` to fail.
Update it so that `snapshot_map` gets updated by `add_heads`.

Reviewed By: sfilipco

Differential Revision: D21479019

fbshipit-source-id: 70528aa4a488cef3dc71bf21dd89e45cfe763794
2020-05-11 09:49:57 -07:00
Jun Wu
f014f86b7a dag: move NameDag algorithms to a trait
Summary:
This makes it easier to add an "in-memory-only" NameDag with all the algorithms
implemented.

Reviewed By: sfilipco

Differential Revision: D21479020

fbshipit-source-id: c1a73e95f3291c273c800650f70db2a7eb0966d7
2020-05-11 09:49:56 -07:00
Xavier Deguillard
4e7303efd9 lfs: only upload when LFS blobs are present
Summary: If no LFS blobs needs uploading, then don't try to connect to the LFS server in the first place.

Reviewed By: DurhamG

Differential Revision: D21478243

fbshipit-source-id: 81fa960d899b14f47aadf2fc90485747889041e1
2020-05-08 13:24:21 -07:00
Meyer Jacobs
d49ac73f4c datastore: remove HgIdDataStore ::get_delta and ::get_delta_chain
Summary:
Remove HgIdDataStore::get_delta and all implementations. Remove HgIdDataStore::get_delta_chain from trait, remove all unnecessary implentations, remove all implementations from public Rust API. Leave Python API and introduce "delta-wrapping".

MutableDataPack::get_delta_chain must remain in some form, as it necessary to implement get using a sequence of Deltas. It has been moved to a private inherent impl.

DataPack::get_delta_chain must remain in some form for the same reasons, and in fact both implenetations can probably be merged, but it is also used in repack.rs for the free function repack_datapack. There are a few ways to address this without making DataPack::get_delta_chain part of the public API. I've currently chosen to make the method pub(crate), ie visible only within the revisionstore crate. Alternatively, we could move the repack_datapack function to a method on DataPack, or use a trait in a private module, or some other technique to restrict visibility to only where necessary.

UnionDataStore::get has been modified to call get on it's sub-stores and return the first which matches the given key.

MultiplexDeltaStore has been modified to implement get similarly to UnionDataStore.

Reviewed By: xavierd

Differential Revision: D21356420

fbshipit-source-id: d04e18a0781374a138395d1c21c3687897223d15
2020-05-07 11:04:01 -07:00
Mark Thomas
052e7c3877 check-code: convert to Python 3
Summary:
Update `contrib/check-code.py` to Python 3.

Mostly it was already compatible, however stricter regular expression parsing
revealed a case where one of our tests wasn't working, and as a result lots of
instances of `open(file).read()` existed that this test should have caught.

I have fixed up most of the instances in the code, although there are many
in the test suite that I have ignored for now.

Reviewed By: quark-zju

Differential Revision: D21427212

fbshipit-source-id: 7461a7c391e0ade947f779a2b476ca937fd24a8d
2020-05-07 09:07:50 -07:00
Durham Goode
939ff6c956 configs: move repo names to a enum
Summary:
A number of repo names are used quite frequently. Let's use an enum to
prevent typos and make things cleaner.

Reviewed By: quark-zju

Differential Revision: D21365036

fbshipit-source-id: 1d3d681443df181e9076f5ee87029ae61124a486
2020-05-06 09:03:17 -07:00
Mohan Zhang
2ef3e20e4d Run auto cargo locally
Summary: Since thirdpart depedency change on D21341319

Reviewed By: jsgf, wqfish

Differential Revision: D21417890

fbshipit-source-id: 3cc6bafa23512c7ae489513216bcafa46e7a744f
2020-05-05 20:59:02 -07:00
Zeyi (Rice) Fan
952069397c fix get blob local
Summary: This bug got in while iterating the original Diff. It should only be returning empty when the blob does not exist locally.

Reviewed By: xavierd

Differential Revision: D21417659

fbshipit-source-id: 676e22313ab4a024af5341d8c99797fc062bd293
2020-05-05 20:21:21 -07:00
Durham Goode
97d84e3b5d configs: move hgrc.dynamic to always be in the shared repo
Summary:
Instead of trying to maintain two hgrc.dynamic's for shared repositories,
let's just always use the one in the shared repo. In the long term we may be
able to get rid of the working-copy-specific hgrc entirely.

This does remove the ability to dynamically configure individual working copies.
That could be useful in cases where we have both eden and non-eden pointed at
the same repository, but I don't think we rely on this at the moment.

Reviewed By: quark-zju

Differential Revision: D21333564

fbshipit-source-id: c1fb86af183ec6dc5d973cf45d71419bda5514fb
2020-05-05 18:19:10 -07:00
Durham Goode
6e7f85b949 config: load .hg/hgrc.dynamic
Summary:
Adds .hg/hgrc.dynamic to the default load path, before .hg/hgrc though,
so it can be override.

Reviewed By: quark-zju

Differential Revision: D21310921

fbshipit-source-id: 288a2a2ba671943a9f8532489c29e819f9d891e1
2020-05-05 18:19:08 -07:00
Durham Goode
dc90e2ca04 configs: add domain, platform, and improve dynamicconfigs
Summary:
Adds dynamic config conditionals for domain, platform, and adds a few
useful helper functions.

Reviewed By: quark-zju

Differential Revision: D21310920

fbshipit-source-id: 58f35e52d6d7a4edae2b3aefff533ef2c021aa57
2020-05-05 18:19:08 -07:00
Durham Goode
8744b81151 git: update git dependency to 0.13.5 to match internal version
Summary:
Our internal git dependency got upgraded, so we need to upgrade our
Cargo.toml version.  Unfortunately this doesn't seem to have any test coverage?

Reviewed By: singhsrb

Differential Revision: D21410241

fbshipit-source-id: 64fe7f39a9c93aa5d97ce095ee1641c1cc6ed365
2020-05-05 15:35:12 -07:00
Zeyi (Rice) Fan
3baa8cc9b4 check if the blob fetching is present locally
Summary:
Talked with xavierd last week and we can use LocalStore's `get_missing` to determine if a blob is present locally. In this way we can prevent the backingstore crate from accidentally asking EdenAPI for a blob, so better control at EdenFS level.

With this change, we can use this function at the time where a blob import request is created with confidence that this should be short cheap call.

This diff should not change any behavior or performance.

Reviewed By: xavierd

Differential Revision: D21391959

fbshipit-source-id: fd31687da1e048262cb4eae2974cab6d8915a76d
2020-05-05 11:14:40 -07:00
Zeyi (Rice) Fan
da28f5a5b1 revisionstore: create directory with group share permission in correct places
Summary: When we create directory at certain places, we want these directories to be shared between different users on the same machine. This Diff uses the previously added `create_shared_dir` function to create these directories.

Reviewed By: xavierd

Differential Revision: D21322776

fbshipit-source-id: 5af01d0fc79c8d2bc5f946c105d74935ff92daf2
2020-05-04 19:21:33 -07:00
Jun Wu
73ff6559e6 zstore: add simple caching
Summary: Add simple caching so zstore can avoid some zstd calculation.

Reviewed By: DurhamG

Differential Revision: D21213076

fbshipit-source-id: 5e3152949cf4e6d6193c3ef3401f24e2efac5620
2020-05-01 14:24:52 -07:00
Durham Goode
3ac2be361a configs: move fbrules to a Facebook only part of the crate
Summary:
We'll be adding a bunch of Facebook specific configuration and values
here. Let's move it to someplace not open source.

Reviewed By: quark-zju

Differential Revision: D21241038

fbshipit-source-id: 2ac9cdce40b1b46f15f171d9d1f6b6692dcd29bf
2020-05-01 13:17:21 -07:00
Durham Goode
b1a2785a19 configparser: add ensure_location_supersets function
Summary:
Implements an ensure_location_supersets function who's goal is to
verify that a given config location specifies the exact same configs as a given
set of other locations. Any inconsistencies are removed from the config and
reported to the caller.

This will be used to ensure our dynamic configs match our existing rc file
configs exactly, before we delete the file configs.

Reviewed By: quark-zju

Differential Revision: D21240837

fbshipit-source-id: e2c8ec054a3696d2cf02e65c212ad886c5117253
2020-05-01 13:17:21 -07:00
Jason White
d5b2fb798e Fix outdated Cargo.toml files
Summary: `cargo autocargo` should normally produce no changes on `master`. The features of the `log` crate was updated in D21303891 without re-running autocargo. This fixes it.

Reviewed By: dtolnay

Differential Revision: D21349799

fbshipit-source-id: ce487bc5989e179673297350249593103b4d34dd
2020-05-01 10:29:33 -07:00
Zeyi (Rice) Fan
8830ed55df util: not try to create the directory when it already exists
Summary: Fix permission issues we are seeing with the latest Mercurial release.

Reviewed By: xavierd

Differential Revision: D21294499

fbshipit-source-id: bcfb13dd005258b2e3b74fa281dbd8df36133ef6
2020-04-28 20:33:59 -07:00
Carolyn Busch
4eeab3b81b Update cpython to 0.5
Summary:
D21270958 updated the cpython, python27-sys, and python3-sys crates to 0.5. Update
the Mercurial cargo dependencies to match.

Reviewed By: xavierd

Differential Revision: D21281875

fbshipit-source-id: ccad68749a25d11240351b5faeef27cb9c693456
2020-04-28 11:47:41 -07:00
Jun Wu
d479053954 metalog: support exporting to a git repo
Summary:
I wanted to figure out "who added this visible head", "what is the difference
between this metalog root and that root". Those are actually source control
operations (blame, diff). Add a git export feature so we can export metalog
to git to run those queries.

Choosing git here as we don't have native Rust utilities to create a more
efficient hg repo yet.

Ideally we can also make hg operate on a metalog directory as a "metalogrepo"
directly. However that seems to be quite difficult right now due to poor
abstractions.

Reviewed By: DurhamG

Differential Revision: D21213073

fbshipit-source-id: 4cc0331fbad6e1586907c0a66c18bcc25608ea49
2020-04-27 20:25:25 -07:00
Jun Wu
a0207c4542 metalog: expose root id API
Summary: This allows the Python world to obtain the root ID for logging purpose.

Reviewed By: DurhamG

Differential Revision: D21179513

fbshipit-source-id: 3f289c06d3d470ff492de39fa985203b3facbf00
2020-04-27 19:50:58 -07:00
Jun Wu
d8bdae0449 indexedlog: remove chown feature
Summary:
We removed the feature in D20704618 and it does not cause complaints.
Let's remove the code supporting the chown feature.

Reviewed By: DurhamG

Differential Revision: D21170307

fbshipit-source-id: c845016219e8c681930bb1780b94e6d31ca99730
2020-04-27 15:47:59 -07:00
Xavier Deguillard
86965b2f80 revisionstore: query store before fetching
Summary:
While the change looks fairly mechanical and simple, the why is a bit tricky.
If we follow the calls of `ContentStore::get`, we can see that it first goes
through every on-disk stores, and then switches to the remote ones, thanks to
that, when we reach the remote stores there is no reason to believe that the
local store attached to them contains the data we're fetching. Thus the
code used to always prefetch the data, before reading from the store what was
just written.

While this is true for regular stores (packstore, indexedlog, etc), it starts
to break down for the LFS store. The reason being that the LFS store is
internally represented as 2 halves: a pointer store, and a blob store.  It is
entirely possible that the LFS store contains a pointer, but not the actual
blob. In that case, the `get` executed on the LFS store will simply return
`Ok(None)` as the blob just isn't present, which will cause us to fallback to
the remote stores. Since we do have the pointer locally, we shouldn't try to
refetch it from the remote store, and thus why a `get_missing` needs to be run
before fetching from the remote store.

As I was writing this, I realized that all of this subtle behavior is basically
the same between all the stores, but unfortunately, doing a:
  impl<T: RemoteDataStore + ?Sized> HgIdDataStore for T
Conflicts with the one for `Deref<Target=HgIdDataStore>`. Macros could be used
to avoid code duplication, but for now let's not stray into them.

Reviewed By: DurhamG

Differential Revision: D21132667

fbshipit-source-id: 67a2544c36c2979dbac70dac5c1d055845509746
2020-04-27 12:53:11 -07:00
Qing Dong
b4edb7ff7f revisionstore: implement the get() functions on the various LocalDataStore interface
Summary: implement the get() functions on the various LocalDataStore interface implementations

Reviewed By: quark-zju

Differential Revision: D21220723

fbshipit-source-id: d69e805c40fb47db6970934e53a7cc8ac057b62b
2020-04-27 12:35:24 -07:00
Xavier Deguillard
39d49b694a revisionstore: remove memcache dependency on @mode/mac
Summary:
Memcache isn't available for Mac, but we can build the revisionstore with Buck
on macOS when building EdenFS. Let's only use Memcache for fbcode builds on
Linux for now.

Reviewed By: chadaustin

Differential Revision: D21235247

fbshipit-source-id: 5943ad84f6442e4dabbd2a44ae105457f5bb9d21
2020-04-24 17:29:36 -07:00
Zeyi (Rice) Fan
3374b99f28 util: ensure correct permission is set when creating directories
Summary:
When creates directories sometime we want to make sure other users within the same group have the write access to it to enable data sharing. Previously we rely on setting umask for the entire process to make sure the newly created directories have the correct permission bit. This is kind fragile and error-prone when running in a multi-thread environment.

This diff introduces an internal function `create_dir_with_mode` to create directory with specified permission mode. It first creates a temporary directory within the parent of the directory being created, setting up the correct permission bit, then attempts to rename the temporary directory to the desired name. This ensures that we never leave a directory without the correct permission in the place we need and without changing umask for the process.

Reviewed By: xavierd

Differential Revision: D21188903

fbshipit-source-id: 381bff7d3aaca097b9d50150e86cbbf70a90a0a5
2020-04-24 17:17:05 -07:00
Durham Goode
092d350800 filesystem: add treestate walking logic
Summary:
The second phase of pending changes is to iterate over the treestate
and figure out what files were not seen in the filesystem walk. This diff
implements that.

Reviewed By: xavierd

Differential Revision: D20546899

fbshipit-source-id: 3523fbc7e31ef0ed09c4937c72264b64e2a3db5b
2020-04-24 13:58:53 -07:00
Durham Goode
73a45b695b filesystem: add filesystem walking to PendingChanges
Summary:
The first phase of pending changes is inspecting the filesystem for
changes. This diff adds that logic.

Reviewed By: xavierd

Differential Revision: D20546909

fbshipit-source-id: 1c2c0fa7f700dbff4acfce4d5271b4472a13571f
2020-04-24 13:58:53 -07:00
Xavier Deguillard
19bfd35298 revisionstore: multiplex stores should return a path on flush
Summary:
On repack, when the Rust stores are in use, the repack code relies on
ContentStore::commit_pending to return the path of a newly created packfile, so
it won't delete it when going over the repacked ones. When LFS is enabled, both
the shared and the local stores are behind the LfsMultiplexer store that
unfortunately would always return `Ok(None)`. In this situation, the repack
code would delete all the repacked packfiles, which usually is the expected
behvior, unless only one packfile is being repacked, in which case the repack
code effectively re-creates the same packfile, and is then subsequently
deleted.

The solution is for the multiplex stores to properly return a path if one was
returned from the underlying stores.

Reviewed By: DurhamG

Differential Revision: D21211981

fbshipit-source-id: 74e4b9e5d2f5d9409ce732935552a02bdde85b93
2020-04-23 15:14:28 -07:00
Arun Kulshreshtha
6d3cacf9fd edenapi: add utility programs
Summary:
Add two utility programs for ad-hoc debugging of EdenAPI. EdenAPI requests and responses are encoded as CBOR, which is not easy to work with manually on the command line. In order to allow debugging the HTTP API using tools like `curl`, we need tools that can generate raw request payloads and interpret CBOR responses.

The utility programs included in this diff are:

- `make_req` - Can construct EdenAPI request payloads from a human-editable JSON representation.
- `data_util` - Can list, validate, and extract the contents of an EdenAPI data response.

These tools can be used by themselves or as part of a pipeline. See test plan for examples.

Reviewed By: xavierd

Differential Revision: D21136575

fbshipit-source-id: d1ac8d92964614005078a6ac76dd0835c29a80a5
2020-04-23 11:43:51 -07:00
Mark Thomas
c05efd8a5c mutationstore: move MutationEntry type to types crate
Summary: Move the MutationEntry type to the Mercurial types crate.  This will allow us to use it from Mononoke.

Reviewed By: quark-zju

Differential Revision: D20871338

fbshipit-source-id: 8de3bb8a2673673bc4c8a6cc7578a0a76358c14a
2020-04-23 08:58:10 -07:00
Durham Goode
dbff6c6b9a filesystem: add initial PendingChanges stubs
Summary:
The part of status that lists what files have changed is called
PendingChanges. This diff introduces the initial stub for PendingChanges. The
pending changes algorithm involves three parts:

1. Looking at files on the filesystem for changes.
2. Looking at files in the dirstate map for changes.
3. Looking at the content for any files that we were unsure of during steps 1
and 2.

This diff puts the basic state machine in place, and accepts the basic
information about the working copy (the root and what type of filesystem it is).
In the future we might have it detect what type of filesystem it is, but for now
this makes it easy.

Reviewed By: xavierd

Differential Revision: D20546898

fbshipit-source-id: a3030b7c846b3cb2fcba805b7fe4744df7c5764e
2020-04-22 19:55:50 -07:00
Durham Goode
701273d08f treestate: trim separators off get_filtered_keys inputs and outputs
Summary:
treestate.get_filtered_keys passes directory paths to the filter
function and returns directory matches with a trailing '/' on the end. This
makes it difficult to act as a path normalization function when the caller
doesn't know if the path is a file or directory.

It seems like we can just strip the trailing '/' before exposing the strings to
the caller (both as filter inputs and as get_filtered_keys outputs).

This is useful in the following diff that adds a case normalization crate.

Reviewed By: xavierd

Differential Revision: D20880881

fbshipit-source-id: 6e9f419178b4e278844244bd6aff2fc10e09d2cd
2020-04-22 19:55:50 -07:00
Durham Goode
e97d8d8895 vfs: move vfs logic into its own crate
Summary:
This logic will be used in a variety of places (update workers, status,
etc). Let's move it somewhere common.

Reviewed By: xavierd

Differential Revision: D20771623

fbshipit-source-id: b4de7c1d20055a10bbc1143d44c55ea1045ec62a
2020-04-22 19:55:49 -07:00
Durham Goode
a6e2b90c2e pathauditor: move into workingcopy crate
Summary:
PathAuditor will be needed for native status soon. Let's move it into
the workingcopy crate.

Reviewed By: xavierd

Differential Revision: D20546906

fbshipit-source-id: ef69f88ee828a72e82b5e944cc7913f391bd8a2f
2020-04-22 19:55:49 -07:00
Durham Goode
faced01356 tracing: add more trace values
Summary: This will help us debug slow commands

Reviewed By: xavierd

Differential Revision: D21075895

fbshipit-source-id: 3e7667bb0e4426d743841d8fda00fa4a315f0120
2020-04-22 15:35:17 -07:00
Xavier Deguillard
51438d13e7 revisionstore: write data to store when reading from memcache
Summary:
The Memcache store is voluntarily added to the ContentStore read store, first
as a regular store, and then as a remote one. The regular store is added to
enable the slower remote store to write to it so that blobs are uploaded to
Memcache as we read them from the network. The subtle part of this is that the
HgIdDataStore methods should not do anything, or the data fetched won't be
written to any on-disk store, forcing a refetch next time the blob is needed.

Reviewed By: DurhamG

Differential Revision: D21132669

fbshipit-source-id: 96e963c7bb4209add5a51a5fc48bc38f6bcd2cd9
2020-04-21 18:35:38 -07:00
Mateusz Kwapich
7ca58332d0 fix xdiff behaviour for empty files
Summary:
When comparing empty file with file with content our xdiff wrongly included
warning about missing newline, which also made the line counter in the hunk
header off-by-one.

Empty files are quite rare in our repos, that's why I discovered this bug only
now (it broke phabricator parsing of this single commit).

Reviewed By: markedson1024

Differential Revision: D21141341

fbshipit-source-id: 9d3e0d8a61ac4ee2cf27978b99b3a092259ee186
2020-04-21 05:30:21 -07:00
Xavier Deguillard
1d231ebac7 revisionstore: return non-uploaded keys
Summary:
Ideally, either the ContentStore, or the upper layer should verify that we
haven't missed uploading a blob, which could lead to weird behavior down the
line. For now, all the stores will return the keys of the blobs that weren't
uploaded, which allows us to return these keys to Python.

Reviewed By: DurhamG

Differential Revision: D21103998

fbshipit-source-id: 5bab0bbec32244291c65a07aa2a13aec344e715e
2020-04-20 21:02:35 -07:00
Durham Goode
7edc29d07c filesystem: move rust walker to it's own file
Summary:
We'll be adding more data to the filesystem layer, so let's move this
out of lib.rs.

Also made a slight tweak to expose File metadata in the walk results, which will be used by the future pending changes logic to avoid re-stating the file.

Reviewed By: xavierd

Differential Revision: D20546903

fbshipit-source-id: 70456055b0da601990e6d6ff535678d2df6c50ba
2020-04-16 16:51:21 -07:00
Jun Wu
c2ffda7622 pager: configure streampager using hg configs
Summary:
This allows the streampager to be configured via hgrc files.

Default are picked so the behavior is closer to the current default pager
(`less -FRX`).

Reviewed By: DurhamG

Differential Revision: D20902034

fbshipit-source-id: 994ab963ceace02eeb1d18cfa5768e411ca3610b
2020-04-15 18:23:11 -07:00
Jun Wu
08d893fbf0 clidispatch: change pager to use stdio
Summary: This makes it work with chg, since `/dev/tty` is not available for chg.

Reviewed By: DurhamG

Differential Revision: D20936967

fbshipit-source-id: f3ded1aa5552f321ff7043a039f4e35a88160a51
2020-04-15 17:12:47 -07:00
Durham Goode
bfe0c8d7ab configs: add dynamic config generator
Summary:
We want Mercurial to become more responsible for it's own
configuration, instead of relying on chef and other means. To do so, let's
introduce a new `hg debugdynamicconfig` that can generate dynamic configs for
a given repository based on various states, like what tier it's in or what shard
that machine is in.  By default it generates to '.hg/hgrc.dynamic' for the given
repository.

Currently it just sets the hostgroup config.

Future diffs will make Mercurial consume this config, and possibly have Mercurial
call this command asynchronously when it notices the file is out-of-date.

Reviewed By: quark-zju

Differential Revision: D20828132

fbshipit-source-id: 6f5bf749f5b04e0a5989d6dc19ee788c2e47f88f
2020-04-14 21:22:26 -07:00
Durham Goode
232658cc81 configparser: add function for serializing configs
Summary:
A future diff will want to generate configs programmatically and write
them to a file. Let's add write support to ConfigSet.

Reviewed By: quark-zju

Differential Revision: D20828133

fbshipit-source-id: 702f6f9bdfdf99ef25c6e1c0ab33373a4b6508fe
2020-04-14 21:22:26 -07:00
Lukas Piatkowski
6afe62eeaa eden/scm: split revisionstore into types and rest of logic
Summary:
The revisionstore is a large crate with many dependencies, split out the types part which is most likely to be shared between different pieces of eden/mononoke infrastructure.

With this split it was easy to get eden/mononoke/mercurial/bundles

Reviewed By: farnz

Differential Revision: D20869220

fbshipit-source-id: e9ee4144e7f6250af44802e43221a5b6521d965d
2020-04-14 07:50:19 -07:00
Genevieve Helsel
a98dd1fa6b suggest graceful restart in hg status / old edenfs warnings
Summary: Since the old Edenfs warning is usually for simply picking up new eden releases, we can suggest the user runs a graceful restart instead of a normal restart to avoid them running into `Transport not connected` errors. This path is only hit in unix environments, so windows users will not see this (since graceful restart isn't supported there yet). Since this is a manual step as well, it will be easier for a user to see if they run into an issue here. This can also enable us to get more telemetry from users running graceful restarts.

Reviewed By: wez

Differential Revision: D20901597

fbshipit-source-id: 9e5c9a90313901be159f66afcbbadc5d7af4fe28
2020-04-13 16:25:58 -07:00
Jun Wu
73f93a438b py3: fix test-help.t
Reviewed By: xavierd

Differential Revision: D20953745

fbshipit-source-id: 7d6dd303e5789ae8c0a6a32ed40d884aa1f24803
2020-04-09 18:25:54 -07:00
Jun Wu
0b5a4ae2c3 cpython-ext: infer errno from io::ErrorKind
Summary:
Sometimes the Rust io::Error is generated without an errno (ex. pipe
0.2 would generate BrokenPipe error without an errno). The Python
land uses errno to check error type (Python does not have io::ErrorKind).

Therefore attempt to translate ErrorKind to Python errno. Without this
exiting the rust pager early would crash like:

  StdioError: [Errno None] pipe reader has been dropped
  abort: pipe reader has been dropped

Reviewed By: markbt

Differential Revision: D20898559

fbshipit-source-id: ef863617e0e500d878ea0f9aeac06b4d87ffbcf2
2020-04-08 13:13:08 -07:00
Lukas Piatkowski
c7d12b648f mononoke/mercurial: make revlog crate OSS buildable
Reviewed By: krallin

Differential Revision: D20869309

fbshipit-source-id: bc234b6cfcb575a5dabdf154969db7577ebdb5c5
2020-04-08 09:49:11 -07:00
Jun Wu
4d951d777b util: add more APIs for pytracing
Summary: This makes the tracing features easier to use.

Reviewed By: DurhamG

Differential Revision: D19797703

fbshipit-source-id: fb5cb17cd389575cf0134a708bcd9df3b90e9ab4
2020-04-07 21:35:31 -07:00
Xavier Deguillard
018e33237a revisionstore: upload local blobs
Summary:
On upload, read all the local blobs and upload them to the LFS server. This is
necessary to support `hg push` or `hg cloud sync` for local LFS blobs.

One of the change made here is to switch from having the batch method return an
Iterator to having them take a callback. This made it easier to write the gut
of the batch implementation in a more generic way.

A future change will also take care of moving local blobs to the shared store
after upload.

Reviewed By: DurhamG

Differential Revision: D20843136

fbshipit-source-id: 92d34a0971263829ff58e137e9905b527e18358d
2020-04-07 16:53:34 -07:00
Xavier Deguillard
0dca734464 revisionstore: add upload to RemoteDataStore
Summary: This method will be used to upload local LFS blobs to the LFS server.

Reviewed By: DurhamG

Differential Revision: D20843137

fbshipit-source-id: 33a331c42687c47442189ee329da33cb5ce4d376
2020-04-07 16:53:34 -07:00
Xavier Deguillard
80b222c01b revisionstore: use loose files for file backed LFS remote
Summary:
Loose files makes it easier to interact with a Mercurial server for tests, use
it instead of an IndexedLog.

Reviewed By: DurhamG

Differential Revision: D20786432

fbshipit-source-id: 61c1fc601d9a6ed157c5add9748e40840b081870
2020-04-07 16:53:34 -07:00
Jun Wu
a4a21b72d1 pager: add bindings to expose Rust's pager features to Python
Summary:
This exposes the Rust's pager to Python. Right now it's using the system
terminal.

Reviewed By: DurhamG

Differential Revision: D20887174

fbshipit-source-id: c72f31a58475e76f8097c515dd29f911d2ac4df1
2020-04-07 15:57:07 -07:00
Jun Wu
817f748113 debugindexedlogdump: output in a streaming way
Summary:
Do not convert the entire output to a string. This makes `debugindexedlog dump`
a good test case for native pager support - it takes a while to write the full
output for a large input.

Reviewed By: DurhamG

Differential Revision: D20885567

fbshipit-source-id: 35ed8f68dff1916f0833577c3cf2a52cbf2a658c
2020-04-07 15:57:06 -07:00
Jun Wu
52bb1ab77e clidispatch: add start_pager API for IO
Summary:
Implement the core API to start pager in native Rust. For now it is only
enabled for the entire command if `--pager=always` is set.

Reviewed By: DurhamG

Differential Revision: D20849644

fbshipit-source-id: 860b4e18d841da607864c3447d78dbac126f5f18
2020-04-07 15:57:06 -07:00
David Tolnay
1a86366f0e third-party/rust: Turn off async-trait/support_old_nightly
Summary:
This diff turns off the support_old_nightly feature of async-trait (https://github.com/dtolnay/async-trait/blob/0.1.24/Cargo.toml#L28-L32) everywhere in fbcode. I am getting ready to remove the feature upstream. It was an alternative implementation of async-trait that produces worse error messages but supports some older toolchains dating back to before stabilization of async/await that the default implementation does not support.

This diff includes updating async-trait from 0.1.24 to 0.1.29 to pull in fixes for some patterns that used to work in the support_old_nightly implementation but not the default implementation.

Differential Revision: D20805832

fbshipit-source-id: cd34ce55b419b5408f4f7efb4377c777209e4a6d
2020-04-02 17:01:24 -07:00
Xavier Deguillard
2ce3c6784b revisionstore: make ContentStore::sha256 infallible
Summary:
Instead of using Sha256::from_slice, just use Sha256::from with a correctly
sized array.

Reviewed By: quark-zju

Differential Revision: D20756181

fbshipit-source-id: 17c869325025078e4c91a564fc57ac1d9345dd15
2020-03-31 08:12:45 -07:00
Adam Simpkins
b31c991c00 update the hg status fast-path to print errors reported by EdenFS
Summary:
Update `hg status` to print errors that were returned by EdenFS's
getScmStatusV2() call, and to exit unsuccessfully if there were any errors.

Previously errors were silently ignored.

Reviewed By: quark-zju

Differential Revision: D19958503

fbshipit-source-id: cb3109df40eb86a5bf7e3818ddfb8da74d670405
2020-03-30 19:24:22 -07:00
Adam Simpkins
ba42390257 fix hg status path relativization in tests on Windows
Summary:
Fix the `test_status()` function to properly canonicalize the input paths, so
that it works successfully on Windows.  Previously this function would fail on
Windows as some paths would end up as UNC-style paths and some would be plain
paths, causing the equality comparison to fail even though the paths were
equivalent.  This canonicalizes the repo path so that they both use the same
format.

Reviewed By: quark-zju

Differential Revision: D20662921

fbshipit-source-id: fdd36bac755f9694b4a482615d3dca43ff21e05e
2020-03-30 19:24:22 -07:00
Xavier Deguillard
22f524664e revisionstore: use Arc<Self> as receiver for remote store traits
Summary:
All of the callers are already using an Arc, so instead of forcing the remote
store to be cloneable, and thus wrap an inner self with an Arc, let's just pass
self as an Arc.

Reviewed By: DurhamG

Differential Revision: D20715580

fbshipit-source-id: 1bef23ae7da7b314d99cb3436a94d04134f1c0e4
2020-03-30 14:45:49 -07:00
Xavier Deguillard
6b1940f294 revisionstore: add fallback if LFS server is down
Summary:
When LFS will be enabled on fbsource, the enablement will rolled out server,
with the server serving pointers (or not). In the catastrophic scenario where
Mononoke has to be rolled out, the Mononoke LFS server will be unable to serve
blobs, but some clients may still have LFS pointers locally but not the
corresponding blob. For this, we need to be able to fallback to fetching the
blob via the getpackv2 protocol.

Reviewed By: DurhamG

Differential Revision: D20662667

fbshipit-source-id: 4ac45558f6d205cbd1db33c21c6fb137a81bdbd5
2020-03-30 14:45:48 -07:00
Xavier Deguillard
e7b5eb81d9 remotefilelog: add lfs fetching retry logic
Summary:
The LFS server might be temporarily having issues, let's retry a bit before
giving up.

Reviewed By: DurhamG

Differential Revision: D20686659

fbshipit-source-id: 90dabd19e45a681d6eae5cd50c72b635d44c0517
2020-03-30 14:45:48 -07:00
Xavier Deguillard
f06132def2 configparser: add FromConfigValue for floats
Summary:
Since we have all the integer types, let's also allow float types in the
config.

Reviewed By: kulshrax

Differential Revision: D20697007

fbshipit-source-id: 21fa264d24c0f63c233f47c3bcfb2448b4c05c70
2020-03-30 14:45:48 -07:00
Xavier Deguillard
15c81cd54f revisionstore: repack even when having one packfile
Summary:
When repacking for the purpose of file format changes, a single packfile may
contain data that needs to be moved out of it, and thus, we need to do a repack
then.

Reviewed By: DurhamG

Differential Revision: D20677442

fbshipit-source-id: c621dd2e657f5a4565b37d4b029731415b899117
2020-03-30 14:45:47 -07:00
Xavier Deguillard
e0cfd08e7f revisionstore: properly implement get_missing for remote stores
Summary:
Remotestores can implement get_missing properly by simply querying the
underlying store that they will be writing to. This may prevent double fetching
some blobs in `hg prefetch` that we already have.

Reviewed By: DurhamG

Differential Revision: D20662668

fbshipit-source-id: 22140b5b7200c687e0ec723dd8879dc8fbea6fb9
2020-03-30 14:45:47 -07:00
Xavier Deguillard
e1ccc29eec revisionstore: add an is_local method to the IndexedLog abstraction
Summary:
There are cases where the user of the abstraction needs to know if this is a
local store, this will simplify the caller code.

Reviewed By: DurhamG

Differential Revision: D20662666

fbshipit-source-id: e0bde7eb0dc3484979732a7c4cdf888fedc70e13
2020-03-30 14:45:46 -07:00
Xavier Deguillard
8ed454845a revisionstore: auto-sync the LFS blobs store
Summary:
By regularly flushing the blob store, we avoid keeping too many LFS blobs in
memory, which could cause OOM issues.

The default size is chosen to be 1GB, but is configurable for more control.

Reviewed By: DurhamG

Differential Revision: D20646213

fbshipit-source-id: 12c06fd0212ef3974bea10c82026b6e74fb5bf21
2020-03-30 14:45:46 -07:00
Xavier Deguillard
5ea2e581c6 revisionstore: store LFS blobs in an IndexedLog
Summary:
In the legacy lfs extension, LFS blobs were stored as loosefiles on disk, and
as we saw with loosefiles for remotefilelog, they can incur a significant
overhead to maintain. Due to LFS blobs being large by definition, the number of
loose LFS blobs should be reasonable for repack to walk over all of them to
chose which one to throw away.

A different approach would be to simply store the blobs in an on-disk format
that allows automatic size management, and simple indexing. That format is an
IndexedLog. This of course doesn't come without drawbacks, the main one being
that the IndexedLog API mandate that the full blob is present on insertion,
preventing streaming writes to it, the solution is to simply chunk the blobs
before writing them to it. While proper streaming is not done just yet, the
storage format no longer prevent it from being implemented.

Reviewed By: DurhamG

Differential Revision: D20633783

fbshipit-source-id: 37a88331e747cf22511aa348da2d30edfa481a60
2020-03-30 14:45:46 -07:00
Arun Kulshreshtha
91e76774fe edenapi: fix typo in logging
Reviewed By: sfilipco

Differential Revision: D20744831

fbshipit-source-id: 194e550d88c33ed9601d2f24fea281996696087b
2020-03-30 14:39:42 -07:00
Jun Wu
6bddc1df08 rotatelog: avoid loading broken Logs multiple times
Summary:
RotateLog loads older logs lazily. If an older log is broken, remember that and avoid
loading the broken log again.

Reviewed By: DurhamG

Differential Revision: D20663899

fbshipit-source-id: 7a4b5279cc6387c19329a51048bfe1be2e0bc1f8
2020-03-30 11:34:49 -07:00
Xavier Deguillard
fd72344578 revisionstore: feature gate the Mononoke LFS tests
Summary:
Due to the Mononoke LFS server only being available on FB's network, the tests
using them cannot run outside of FB, including in the github workflows.

Reviewed By: quark-zju

Differential Revision: D20698062

fbshipit-source-id: f780c35665cf8dc314d1f20a637ed615705fd6cf
2020-03-30 08:40:43 -07:00
Stefan Filip
ea89b541e1 segmented_changelog: add Dag struct and location_to_name functionality
Summary:
The IdDag provides graph algorithms using Segments.
The IdMap allows converting from the SegmentedChangelogId domain to the
ChangesetId domain.
The Dag struct wraps IdDag and IdMap in order to provide graph algorithms using
the common application level identifiers for commits (ChangesetId).

The construction of the Dag is currently mocked with something that can only be
used in a test environment (unit tests but also integration tests).

This diff also implements a location_to_name function. This is the most
important new functionality that segmented changelog clients require. It
recovers the hash of a commit for which the client only has a segmented
changelog Id. The current assumption is that clients have identifiers for all
merge commit parents so the path to a known commit always follow a set
of first parents.

The IdMap queries will have to be changed to async in the future, but IdDag
queries we expect to stay sync.

Reviewed By: quark-zju

Differential Revision: D20635577

fbshipit-source-id: 4f9bd8dd4a5bd9b0de55f51086f3434ff507963c
2020-03-27 13:48:52 -07:00
Stefan Filip
7502ce31ca dag: add in process stored IdMap constructor
Summary: The interesting observation is that InProcessStore is not public.

Reviewed By: quark-zju

Differential Revision: D20635578

fbshipit-source-id: a0149929c8059ff77f047fd385bf3b26dc738dfd
2020-03-27 13:48:51 -07:00
Lukas Piatkowski
bace9bc0ae mononoke: make filenode and mercurial_types OSS buildable
Reviewed By: farnz

Differential Revision: D20281345

fbshipit-source-id: bc4910d3040d74c7ceb4cb825bea6960952f6310
2020-03-27 11:40:13 -07:00
Xavier Deguillard
179caf60dd revisionstore: pass the config when building the LFS stores
Summary: Instead of hardcoding some values, use configured ones.

Reviewed By: DurhamG

Differential Revision: D20633784

fbshipit-source-id: d17657b1fca29280f2d07e8cf824e553ad3704f7
2020-03-26 19:02:49 -07:00
Xavier Deguillard
d404b0a228 revisionstore: revamp repack to ease file format migration
Summary:
One of the main drawback of the current version of repack is that it writes
back the data to a packfile, making it hard to change file format. Currently, 2
file format changes are ongoing: moving away from packfiles entirely, and
moving from having LFS pointers stored in the packfiles, to a separate storage.

While an ad-hoc solution could be designed for this purpose, repack can
fullfill this goal easily by simply writing to the ContentStore, the
configuration of the ContentStore will then decide where this data will
be written into.

The main drawback of this code is the unfortunate added duplication of code.
I'm sure there is a way to avoid it by having new traits, I decided against it
for now from a code readability point of view.

Reviewed By: DurhamG

Differential Revision: D20567118

fbshipit-source-id: d67282dae31db93739e50f8cc64f9ecce92d2d30
2020-03-26 19:02:48 -07:00
Xavier Deguillard
d4780ee322 revisionstore: store a HashMap of ContentHash in LFS pointer
Summary:
While the primary (for now) way of addressing an LFS blob is via its sha256,
being able to address them via different hash schemes (sha1 for Eden/Buck,
blake2, etc) will be helpful down the line. Thus, let's store a HashMap of
ContentHash in the pointer store.

Reviewed By: DurhamG

Differential Revision: D20560197

fbshipit-source-id: 8bdc4fc4cd7fc19c7eed6a27d11953c4eedf9195
2020-03-26 19:02:47 -07:00
Xavier Deguillard
2f314fc43c revisionstore: remove unnecessary lock around the LfsBlobsStore
Summary: No locking is required for this one due to being loose files on disk.

Reviewed By: DurhamG

Differential Revision: D20522890

fbshipit-source-id: 72b7ebc063060a89f54976a1128977a3b7501053
2020-03-26 19:02:47 -07:00
Xavier Deguillard
6372a4a4fc revisionstore: add an is_lfs method to Metadata
Summary:
Instead of having the magic number 0x2000 all over the place, let's move the
logic to this method.

Reviewed By: DurhamG

Differential Revision: D20637749

fbshipit-source-id: bf666f8787e37e6d6c58ad8982a5679b7e3e717b
2020-03-25 12:29:25 -07:00
Stefan Filip
c400809eba dag: rename child index iteration to iter_master_flat_segments_with_parent
Summary:
`iter_segments_with_parent` has a few more conditions attached to it than the
name would imply. We are renaming it to give a better sense of its true
behavior.

Reviewed By: quark-zju

Differential Revision: D20547631

fbshipit-source-id: 406f46b9de5efc9e8e6a8c4bc22ab18fa5bc54bb
2020-03-24 13:58:07 -07:00
Stefan Filip
59ff2a8571 dag: remove_non_master implementation for
Summary: Also adding better tests for non master entries.

Reviewed By: quark-zju

Differential Revision: D20504483

fbshipit-source-id: 60d4a20aecb00f7750db2fff5d3832aac99d00e2
2020-03-24 13:58:06 -07:00
Stefan Filip
03c1e1cac5 dag: iterator implementations for InProcessStore
Summary:
The main question I had while writing the tests was whether we expect a
specific order for Segments for `iter_segments_with_parent`. `InProcessStore`
will return the segments in the order that they were inserted.

Reviewed By: quark-zju

Differential Revision: D20501401

fbshipit-source-id: 48ceb78f3191c7425c1488a3392cf3167f7e7268
2020-03-24 13:58:06 -07:00
Stefan Filip
5f4e706f81 dag: Add InProcessStore as iddagstore
Summary:
First 6 methods implemented from the IdDagStore trait for the InProcessStore.

Any suggestions welcome.

Reviewed By: quark-zju

Differential Revision: D20499228

fbshipit-source-id: cb536a3a0136077ada78934d82a25d079a5bc809
2020-03-24 13:58:06 -07:00
Stefan Filip
3dcb56535e dag: add descriptions to IdDagStore methods
Summary: Documentation.

Reviewed By: quark-zju

Differential Revision: D20499926

fbshipit-source-id: ebbb7a1249109bd56ff459a659e0c628c2974179
2020-03-24 13:58:05 -07:00
Steven Troxler
d90435ced9 Deprecate rust-crypto in eden/scm/lib/revisionstore
Summary:
Replace `rust-crypto` with `hex`, `sha-1`, `sha2`.
 - `crypto::sha1::Sha1` with `sha1::Sha1`
 - `crypto::sha2::Sha2` with `sha2::Sha2`
 - `crypto::digest::Digest` with `sha1::Digest` and `sha2::Digest`
 - `.result_str()` with `hex::encode` and `.result()`

Reviewed By: jsgf

Differential Revision: D20588313

fbshipit-source-id: 75c4342e8b6285f0f960f864c21457a1a0808f64
2020-03-23 16:38:07 -07:00
Xavier Deguillard
674e9c7900 fsinfo: return an enum instead of a String
Summary:
In a strongly typed langage, using strings should be avoided whenever possible
as they do not provide the safety guarantees that types provide.

I took the liberty of removing all the filesystems that are not relevant for
Mercurial for simplification reasons. If needs arise, we can always add a new
FsType to the enum.

Reviewed By: DurhamG

Differential Revision: D20517138

fbshipit-source-id: 0a38b53c6a87f05f4b2d664038e10c4293de96ae
2020-03-23 14:29:10 -07:00
Steven Troxler
b6de099da8 Deprecate rust-crypto in zstore
Summary:
Replace `rust-crypto` with `sha-1`:
 - `crypto::digest::Digest` with `sha1::Digest`
 - `crypto::Sha1` with `sha1::Sha1`

The interface changes slightly - no need to pass a mutable byte array when
getting the result.

Reviewed By: jsgf

Differential Revision: D20587638

fbshipit-source-id: c6c737f3f8eba94b98c728e198eb4fac12c5c80b
2020-03-23 11:29:09 -07:00
Steven Troxler
71279b3916 Deprecate rust-crypto in manifest-tree
Summary:
Replace `rust-crypto` with `sha-1`:
 - `crypto::digest::Digest` with `sha1::Digest`
 - `crypto::sha1::Sha1` with `sha1::Sha1`

Reviewed By: jsgf

Differential Revision: D20587716

fbshipit-source-id: de801c20bffd356eb5b2205a63ec0218b3aca6c0
2020-03-23 11:15:58 -07:00
Steven Troxler
154c458285 Deprecate rust-crypto in eden/scm/lib/types
Summary:
Swap out `rust-crypto` for `sha-1`
 - `crypto::sha1::Sha1` is replaced by `sha1::Sha1`
 - `crypto::digest::Digest` is replaced by `digest::Digest`

Reviewed By: jsgf

Differential Revision: D20587685

fbshipit-source-id: 971fdaa8ce5b3e9e60db219131f6c36dcbc213d9
2020-03-23 11:15:57 -07:00
Steven Troxler
3534ca3bb8 Deprecate rust-crypto in edenfs-client
Summary:
Switched out the `sha` package for the `rust-crypto` package. The
apis aren't an exact match, so I had to insert a clone in place of
a modification to a mutable reference.

Reviewed By: jsgf

Differential Revision: D20585336

fbshipit-source-id: 22245157aea1115ae6f225b17b0346f0696653f7
2020-03-23 11:04:36 -07:00
Xavier Deguillard
767134797c pyerror: stringify Rust errors with "{:?}"
Summary:
According to the anyhow documentation[0], the behavior of `.to_string()` is to
only stringify the top-level errors, hiding all the context of the error.
Instead, the debug format allows all the context to be displayed, and, if
available the backtrace.

This should significantly help debug Rust errors when context is available,
which we should strive to have everywhere!

[0]: https://docs.rs/anyhow/1.0.27/anyhow/struct.Error.html#display-representations

Reviewed By: sfilipco

Differential Revision: D20575944

fbshipit-source-id: 2968d7fb755edec7f7e5151138e8049ded181c1b
2020-03-20 20:22:14 -07:00
Lukas Piatkowski
12f639159e cargo_from_buck: get rid of signatures in generated Cargo.toml files
Summary: The signatures were used by the linter to warn if the files require regenerating, since the linter now regenerates the files regardless of the signature it is no longer needed to sign the files.

Reviewed By: krallin

Differential Revision: D20467745

fbshipit-source-id: aff2643f80939d5693e7a30abf07484c9060796f
2020-03-20 08:56:11 -07:00
Xavier Deguillard
68edce4365 lfs: allow the LFS remote to be a local directory
Summary:
This is only intended for Mercurial .t tests and not in any production
environment.

Reviewed By: DurhamG

Differential Revision: D20504236

fbshipit-source-id: 618e17631b73afa650875cb7217ba7c55fb9f737
2020-03-19 14:36:19 -07:00
Xavier Deguillard
092cfcec7d revisionstore: add a ContentDataStore trait
Summary:
For now, this is only used for LFS, as this is the only store that can
correctly answer both.

This API will be exposed to Python to be able to have cheap filectx comparison,
and other use cases.

Reviewed By: DurhamG

Differential Revision: D20504234

fbshipit-source-id: 0edb912ce479eb469d679b7df39ba80fceef05f2
2020-03-19 14:36:18 -07:00
Xavier Deguillard
632bd53a02 revisionstore: add a LFS remote store
Summary:
This enables fetching blobs from the LFS server. For now, this is limited to
fetching them, but the protocol specify ways to also upload. That second part
will matter for commit cloud and when pushing code to the server.

One caveat to this code is that the LFS server is not mocked in tests, and thus
requests are done directly to the server. I chose very small blobs to limit the
disruption to the server, by setting a test specific user-agent, we should be
able to monitor traffic due to tests and potentially rate limit it.

Reviewed By: DurhamG

Differential Revision: D20445628

fbshipit-source-id: beb3acb3f69dd27b54f8df7ccb95b04192deca30
2020-03-19 14:36:18 -07:00
Jun Wu
6ffdcebadf tracing: write some blackbox events as tracing events
Summary:
This is the start of migrating blackbox events to tracing events. The
motivation is to have a single data source for log processing (for simplicity)
and the tracing data seems a better fit, since it can represent a tree of
spans, instead of just a flat list. Eventually blackbox might be mostly
a wrapper for tracing data, with some minimal support for logging some indexed
events.

Reviewed By: DurhamG

Differential Revision: D19797710

fbshipit-source-id: 034f17fb5552242b60e759559a202fd26061f1f1
2020-03-19 10:23:24 -07:00
Jun Wu
8cc30ac302 dag: add Segment::new API
Summary:
Now Segment has no lifetime we can create it directly and return the ownership.

Performance of "building segments" does not seem to change:

  # before
  building segments                                 750.129 ms

  # after
  building segments                                 712.177 ms

Reviewed By: sfilipco

Differential Revision: D20505200

fbshipit-source-id: 2448814751ad1a754b90267e43262da072bf4a16
2020-03-18 15:05:58 -07:00
Jun Wu
1bd54a5971 dag: drop lifetime on Segment<'a>
Summary:
This allows structures like BTreeMap to own and store Segment.

It was not possible until D19818714, which adds minibytes::Bytes interface for
indexedlog.

In theory this hurts performance a little bit. But the perf difference does not
seem visible by `cargo bench --bench dag_ops`:

  # before
  building segments                                 714.420 ms
  ancestors                                          54.045 ms
  children                                          490.386 ms
  common_ancestors (spans)                            2.579 s
  descendants (small subset)                        406.374 ms
  gca_one (2 ids)                                   161.260 ms
  gca_one (spans)                                     2.731 s
  gca_all (2 ids)                                   287.857 ms
  gca_all (spans)                                     2.799 s
  heads                                             234.130 ms
  heads_ancestors                                    39.383 ms
  is_ancestor                                       113.847 ms
  parents                                           251.604 ms
  parent_ids                                         11.412 ms
  range (2 ids)                                     117.037 ms
  range (spans)                                     241.156 ms
  roots                                             507.328 ms

  # after
  building segments                                 750.129 ms
  ancestors                                          53.341 ms
  children                                          515.607 ms
  common_ancestors (spans)                            2.664 s
  descendants (small subset)                        411.556 ms
  gca_one (2 ids)                                   164.466 ms
  gca_one (spans)                                     2.701 s
  gca_all (2 ids)                                   290.516 ms
  gca_all (spans)                                     2.801 s
  heads                                             240.548 ms
  heads_ancestors                                    39.625 ms
  is_ancestor                                       115.735 ms
  parents                                           239.353 ms
  parent_ids                                         11.172 ms
  range (2 ids)                                     115.483 ms
  range (spans)                                     235.694 ms
  roots                                             506.861 ms

Reviewed By: sfilipco

Differential Revision: D20505201

fbshipit-source-id: c34d48f0216fc5b20a1d348a75ace89ace7c080b
2020-03-18 15:05:57 -07:00
Xavier Deguillard
db310fc87f revisionstore: replace lazy_init with once_cell
Summary:
The later is what is now recommended, and no longer requires a macro to
initialize a lazy value, leading to nicer code.

Reviewed By: DurhamG

Differential Revision: D20491488

fbshipit-source-id: 2e0126c9c61d0885e5deee9dbf112a3cd64376d6
2020-03-18 12:20:12 -07:00
Xavier Deguillard
9c8633bb0a revisionstore: address clippy warnings
Summary:
Lots of different warnings on this one. Main ones were:
 - One bug where .write was used instead of .write_all
 - Using .next instead of .nth(0) for iterators,
 - Using .cloned() instead of .map(|x| x.clone())
 - Using conditions as expressions instead of mut variables
 - Using .to_vec() on slices instead of .iter().cloned().collect().
 - Using .is_empty instead of comparing .len() against 0.

Reviewed By: DurhamG

Differential Revision: D20469894

fbshipit-source-id: 3666a44ad05e0fbfa68d490595703c022073af63
2020-03-18 10:16:39 -07:00
Xavier Deguillard
a760c0e672 edenapi: address clippy warnings
Reviewed By: DurhamG

Differential Revision: D20469646

fbshipit-source-id: 222f75196ef140c2e9bdfc0a0500f3fbcffb2309
2020-03-18 10:16:39 -07:00
Xavier Deguillard
121e524df9 blackbox: address clippy warnings
Reviewed By: DurhamG

Differential Revision: D20469649

fbshipit-source-id: 99b0e68259b5e2ed5b1c969d0a5fa8473e899f17
2020-03-18 10:16:39 -07:00
Xavier Deguillard
aae9075762 lz4-pyframe: address clippy warnings.
Reviewed By: DurhamG

Differential Revision: D20469648

fbshipit-source-id: 346c8a23ff2b4a895a066843ebe5341103956e76
2020-03-18 10:16:38 -07:00
Xavier Deguillard
8c1f033f50 indexedlog: address clippy warnings
Summary:
These were from a wide variety of warnings. The only one I haven't addressed is
that clippy complains that Pin<Box<Vec<u8>>> can be replaced by Pin<Vec<u8>>. I
haven't investigated too much into it, someone more familiar with this code can
probably figure out if this is buggy or not :)

Reviewed By: DurhamG

Differential Revision: D20469647

fbshipit-source-id: d42891d95c1d21b625230234994ab49bbc45b961
2020-03-18 10:16:38 -07:00
Xavier Deguillard
42f1213efa util: address clippy warnings
Summary: The lifetime is unecessary.

Reviewed By: DurhamG

Differential Revision: D20452750

fbshipit-source-id: 184f5e109a0ff59931bdddaf611a7581d2255e78
2020-03-18 09:35:36 -07:00
Jun Wu
e48079180f indexedlog: fix a typo in benchmarks
Summary:
This belongs to D20149376. However buck test does not include benchmarks so it
was not noticed.

Reviewed By: DurhamG

Differential Revision: D20505097

fbshipit-source-id: 24daeb17b68808f8e69e18452ab2cf26c7aa10a7
2020-03-18 09:30:31 -07:00
Mark Thomas
5666399fcf mutationstore: switch mutation entry timestamp from f64 to i64
Summary:
The mutation store stores entries with a floating-point timestamp.  This
pattern was copied from obsmarkers.

However, Mercurial uses integer timestamps in the commit metadata (the
parser supports floats for historical reasons, but only stores integer
timestamps).   Mononoke also uses integer timestamps in its `DateTime`
type.

To keep things simple, switch to using integer timestamps for mutation
entries.  Existing entries with floating point timestamps are truncated.

Add a new entry format version that encodes the timestamp as an integer.
For now, continue to generate the old version so that old clients can
read entries created by new clients.

Reviewed By: quark-zju

Differential Revision: D20444366

fbshipit-source-id: 4d6d9851aacb314abea19b87c9d0130c47fdf512
2020-03-17 04:18:44 -07:00
Mark Thomas
ac80212e8f mutationstore: remove mutation entry origins
Summary:
Tracking the origin of mutation entries did not prove useful, and just creates
an un-necessary overhead.  Remove the tracking and repurpose the field as a
version field.

Reviewed By: quark-zju

Differential Revision: D20444365

fbshipit-source-id: 65ff11ee8cfe77d5e67a83d03a510541d58ef69b
2020-03-17 04:18:44 -07:00
Xavier Deguillard
deffd9a477 minibytes: address clippy warnings
Summary: Using ptr.add is shorter and preferred to ptr.offset.

Reviewed By: quark-zju

Differential Revision: D20452752

fbshipit-source-id: 1dc2fdbc392267d2d690673c10dcc161ecd00dfa
2020-03-16 14:58:22 -07:00
Xavier Deguillard
67c8cf22a3 hgtime: address clippy warnings
Summary:
These warnings are fairly trivial, as it recommends using single quote (char)
for single characters search instead of a double quote (str).

Reviewed By: quark-zju

Differential Revision: D20452408

fbshipit-source-id: b2951e133e57633a8e766536e22969fa9ac0ecee
2020-03-16 14:58:22 -07:00
Xavier Deguillard
bb30c40375 types: address clippy warnings
Summary:
Clippy had 3 sources of warnings in this crate:
 - from_str method not in impl FromStr. We still have 2 of them in path.rs, but
   this is documented as not supported by the FromStr trait due to returning a
   reference. Maybe we can find a different name?
 - Use of mem::transmute while casts are sufficient. I find the cast to be
   ugly, but they are simply safer as the compiler can do some type checking on
   them.
 - Unecessary lifetime parameters

Reviewed By: quark-zju

Differential Revision: D20452257

fbshipit-source-id: 94abd8d8cd76ff7af5e0bbfc97c1e106cdd142b0
2020-03-16 14:58:21 -07:00
Xavier Deguillard
82d3c7f544 configparser: address clippy warnings
Summary:
Clippy complains about 3 things:
 - Using raw pointers in a public function that is not declared as unsafe. This
   happens for C exported ones, this feels like a warning, so I haven't changed
   it.
 - Using .map(...).unwrap_or(<default value constructed>). The recommendation
   is to use .unwrap_or_default().
 - Single match instead of if let, the latter makes code much shorter.

Reviewed By: quark-zju

Differential Revision: D20452751

fbshipit-source-id: 8eeff7581c119c651ca41d8117f1f70f15774833
2020-03-16 14:53:45 -07:00
Stefan Filip
1fb5acf242 dag: use IdDagStore in IdDag with type parameter
Summary: Make IdDag storage generic by depending on IdDagStore.

Reviewed By: quark-zju

Differential Revision: D20471712

fbshipit-source-id: 3a2668f301758a3c880db35c9f0db6887ef1dd38
2020-03-16 14:41:41 -07:00
Stefan Filip
236292c0fd dag: add the GetLock trait
Summary: Used to generalize `get_lock` functionality.

Reviewed By: quark-zju

Differential Revision: D20471710

fbshipit-source-id: e44d5b22ecacdb653170ef83914354f521f82dfc
2020-03-16 14:41:40 -07:00
Stefan Filip
66436b4a3c dag: add the IdDagStore trait
Summary: Abstract the storage functionality required by IdDag.

Reviewed By: quark-zju

Differential Revision: D20449122

fbshipit-source-id: fc3c7d7b88d74f7a93670d310be2e680f35e8ce7
2020-03-16 14:41:40 -07:00
Stefan Filip
1239628ef8 dag: move IdDag storage details to the iddagstore module
Summary:
Right now the module has one implementation IndexedLogStore. The name could
be more specific in the context of the crate.

The goal will be to add a trait for storage requirements of IdDag and
make IndexedLogStorage one implementation of that trait.

Reviewed By: quark-zju

Differential Revision: D20446042

fbshipit-source-id: 7576e1cc4ad757c1a2c00322936cc884838ff710
2020-03-16 14:41:40 -07:00
Jun Wu
1f64b4ec50 nameset: fix LazySet iteration
Summary:
The `next` method forgot to increase the iteration index, causing infinite
iteration.

Reviewed By: ikostia

Differential Revision: D20473206

fbshipit-source-id: 82a95de1b1c12ac4e9e4d328a0adba7145d7b24c
2020-03-16 13:00:35 -07:00
Jun Wu
8115053c00 indexedlog: implement xxd-like fmt::Debug for Log
Summary: This makes `hg debugindexedlog dump` more useful.

Reviewed By: sfilipco

Differential Revision: D20448863

fbshipit-source-id: c5cc24449ae00ee329ce02bf0adf947ff57e72ed
2020-03-16 10:21:46 -07:00
Durham Goode
a13fcd4910 workingcopy: support returning directories from the walker
Summary:
Purge needs to be able to see what directories the walker traversed, so
it can delete them if they are empty. Instead of having the walker call
match.traversedir (which it seems like a bizarre pattern to use the matcher as a
holder for a non-matching related function), let's have the walker return an
enum and have an option to return directories.

At the python layer we then translate this into match.traversedir calls, but we
can clean that up later.

Reviewed By: quark-zju

Differential Revision: D19543795

fbshipit-source-id: cc51c86c91799d3df2c65d25a7b6cfe810206d0a
2020-03-16 10:15:26 -07:00
Durham Goode
fc7739fa26 workingcopy: rename walker results
Summary:
In preparation for supporting returning directories from the walker (to
support purge), let's rename the result structure to be more generic.

Reviewed By: kulshrax

Differential Revision: D19543791

fbshipit-source-id: 9b71452c879cf397ae92533a4ef4727140ac7369
2020-03-16 10:15:26 -07:00
Durham Goode
05e09b2b89 workingcopy: report invalid file types from rust walker
Summary:
The mercurial tests print errors when they encounter 'fifo' files.
Let's handle that case.

Differential Revision: D19543796

fbshipit-source-id: f87d4b9c3f0ad8b8d8ebe2e6d18e325fc93d0ae9
2020-03-16 10:15:25 -07:00
Xavier Deguillard
fd8d92f1f5 revisionstore: allow indexing LFS pointers via sha256
Summary:
While the sha256 of a blob gives access to its content, it doesn't allow
accessing its metadata, by adding a sha256 index, we can easily get the
metadata of a blob via its content hash.

Reviewed By: quark-zju

Differential Revision: D20445624

fbshipit-source-id: 42c04bd69d3c7380706c6237c5b4f4061c016cca
2020-03-13 19:03:29 -07:00
Xavier Deguillard
d9cca63444 types: add a into_inner method to Sha256
Reviewed By: quark-zju

Differential Revision: D20445623

fbshipit-source-id: d9cba7ddd16a8e89c76cd5e988ab0fb79383d0c2
2020-03-13 19:03:29 -07:00
Xavier Deguillard
60be0ac94d types: fix typo when displaying Sha256
Reviewed By: quark-zju

Differential Revision: D20445622

fbshipit-source-id: dc9a8a165ca55fdece90a5eb3a87cd3c28f444cb
2020-03-13 19:03:29 -07:00
Xavier Deguillard
6ee3a8f42f revisionstore: add metadata to FakeHgIdRemoteStore
Summary: This is necessary to properly test LFS stores.

Reviewed By: quark-zju

Differential Revision: D20445625

fbshipit-source-id: 530ddf87249e8d721957806f2d8edef3262f303c
2020-03-13 19:03:28 -07:00
Xavier Deguillard
5002d01e0a revisionstore: allow indexedlogutil users to lookup in different indices
Summary:
The OpenOptions allow for multiple indices to be added, but lookup had no way
to querying these multiple indices.

Reviewed By: quark-zju

Differential Revision: D20445627

fbshipit-source-id: 0cb754ba17b452d892b7bcb56d502d5753ef963a
2020-03-13 19:03:28 -07:00
Xavier Deguillard
01fb3c0a77 revisionstore: add a new StoreKey type
Summary:
This type can either be a Mercurial type key, or a content hash based key. Both
the prefetch and get_missing now can handle these properly. This is essential
for stores where data can either be fetched in both ways or when the data is
split in 2. For LFS for instance, it is possible to have the LFS pointer (via
getpackv2), but not the actual blob. In which case get_missing will simply
return the content hash version of the StoreKey, to signify what it actually
has missing.

Reviewed By: quark-zju

Differential Revision: D20445631

fbshipit-source-id: 06282f70214966cc96e805e9891f220b438c91a7
2020-03-13 19:03:28 -07:00
Xavier Deguillard
d900874401 revisionstore: rename HistoryStore to HgIdHistoryStore
Summary:
Similarly to the DataStore trait, this makes it easier to understand that they
deal with a Mercurial type Key.

Reviewed By: quark-zju

Differential Revision: D20445621

fbshipit-source-id: a1143d5f5d6a2c8686d517a6ea3c25b07c0df072
2020-03-13 19:03:27 -07:00
Xavier Deguillard
2e4742cefc revisionstore: rename DataStore traits to HgIdDataStore
Summary: This makes it clear that these traits are dealing with Mercurial Key.

Reviewed By: quark-zju

Differential Revision: D20445626

fbshipit-source-id: d5acbf442e9407b973e95e40af69b5a61bff0a4d
2020-03-13 19:03:27 -07:00
Jun Wu
cf04fe3e1f thrift-types: recompile Thrift sources
Summary: The thrift compiler and sources are changed.

Reviewed By: xavierd

Differential Revision: D20445164

fbshipit-source-id: f20f16ae02a922042f366a9a80a3642577f60e57
2020-03-13 14:25:23 -07:00
Jun Wu
7a7f98f1b2 configparser: migrate from Bytes to Text
Summary:
Since configparser enforces utf-8 config files (because pest wants Rust strings),
let's migrate from Bytes to Text to remove extra encoding conversions.

Previously this was blocked by the lack of ref-counted text (since the "source"
of each config location is the entire config file). Now minibytes provides Text
so we can use it.

This unfortunately requires dependent code to be updated. The pyconfigparser
interface is in theory wrong - it shouldn't return utf-8 bytes but
local-encoded bytes. I think it's cleaner to make pyconfigparser unaware of
HGENCODING, so I changed pyconfigparser to use unicode, and add compatibility
layer in uiconfig.py.

This also fixes non-ascii encoding issues on user name (especially on Windows).
The hgrc config file should be in utf-8 and the config parser returns explicit
unicode types, and Python code round-trip them with local encodings.

Reviewed By: markbt

Differential Revision: D20432938

fbshipit-source-id: b1359429b8f1c133ab2d6b2deea6048377dfeca1
2020-03-13 10:51:41 -07:00
Jun Wu
715bc5d451 configparser: migrate from bytes to minibytes
Summary:
This makes it easier to further migrate to `Text` interface.
Dependent crate (`auth`) is updated.

Reviewed By: markbt

Differential Revision: D20432941

fbshipit-source-id: 1dc29d52c9b17ce14676ef0555470c6d36a09c2b
2020-03-13 10:51:41 -07:00
Jun Wu
c4ec99ded4 minibytes: implement Text
Summary:
Text is a reference-counted shared String.
It's similar to Bytes but works for utf-8 strings.

The motivation is to replace configparser's use of Bytes to Text.

Reviewed By: markbt

Differential Revision: D20432940

fbshipit-source-id: ef990255d269e60d433c6520819f60ccdcbe488f
2020-03-13 10:51:41 -07:00
Jun Wu
7895e70dcf minibytes: make Bytes abstract
Summary: This makes it possible to implement "Text". See the next diff.

Reviewed By: markbt

Differential Revision: D20432943

fbshipit-source-id: 94b3810ab205c260d33f57bd637e4accc3ee871d
2020-03-13 10:51:40 -07:00
Jun Wu
e9b14b3608 minibytes: implement From<&'static {str,[u8]}>
Summary:
This makes the API easier to use.

Practically this makes it easier for configparser to migrate to minibytes.

Reviewed By: markbt

Differential Revision: D20432942

fbshipit-source-id: ad08eb118d2216054dc24c86b0b129ae82b9d17c
2020-03-13 10:51:40 -07:00
Jun Wu
ad8190713b cpython-ext: serialize Rust str into Python str type
Summary:
Previously Rust str was serialized into bytes. To be Python 3 friendly, let's
serialize it into `str`.

Reviewed By: markbt

Differential Revision: D19797706

fbshipit-source-id: 388eb044dc7e25cdc438f0c3d6fa5a5740f22e3d
2020-03-12 12:19:38 -07:00
Jun Wu
3376363721 tracing-collector: add is_event to TreeSpan
Summary: Expose the is_event property via public APIs.

Reviewed By: DurhamG

Differential Revision: D19797705

fbshipit-source-id: f441825e98208964f7b3d6815a177b464430cbb7
2020-03-12 12:19:38 -07:00
Stanislau Hlebik
ba871d3bdc xdiff: allow rendering diff for large files
Summary:
The goal of the stack is to support "rendering" diffs for large files in scs
server. Note that rendering is in quotes - we are fine with just showing a
placeholder like "Binary file ... differs". This is still better than the
current behaviour which just return an error.

In order to do that I suggest to tweak xdiff library to accept FileContentType
which can be either Normal(...) meaning that we have file content available, or
Omitted, which usually means the file is large and we don't even want to fetch it, and we
just want xdiff to generate a placeholder.

Reviewed By: markbt, krallin

Differential Revision: D20389226

fbshipit-source-id: 0b776d4f143e2ac657d664aa9911f6de8ccfea37
2020-03-12 04:27:23 -07:00
Jun Wu
194b38385a nameset: add a way to convert between NameSet and SpanSet
Summary:
This will be used in the Python world for legacy reasons. It shouldn't be used
in new Rust node.

To use it, the name `LegacyCodeNeedIdAccess` has to be used so we can do a code
search to find all users of it.

Reviewed By: sfilipco

Differential Revision: D20367834

fbshipit-source-id: 9b93a29f1461ce24bba6f31a2bbb1f327e216c6d
2020-03-11 20:37:30 -07:00
Jun Wu
eef56d9c5b namedag: add a sort API
Summary: This will be useful to actually sort commits.

Reviewed By: sfilipco

Differential Revision: D20367835

fbshipit-source-id: 43bc7835277af3a14ef323ce34247e0c03878dc8
2020-03-11 20:37:29 -07:00
Jun Wu
2ecc0bb757 namedag: move "all" concept to DagSet
Summary:
The old "AllSet" implementation is not very practical - it does not support
iteration. Practically, the "all()" set comes from the DAG. Change the "all"
concept to a hint similar to "is_topo_sorted", and update the fast path
(intersection) accordingly.

Reviewed By: sfilipco

Differential Revision: D20367837

fbshipit-source-id: fdbf370897c93058bfcab0571c1f6fa4b99b0f6b
2020-03-11 20:37:29 -07:00
Jun Wu
ef1696b4db namedag: rename arc_map to snapshot_map
Summary: The word "snapshot" more accurately describes its purpose.

Reviewed By: sfilipco

Differential Revision: D20367836

fbshipit-source-id: c91a0bd402fa1718b5d805beedc0e062824c53d3
2020-03-11 20:37:29 -07:00
Jun Wu
c5c75c9f59 fsinfo: autocorrect "" to "."
Summary:
Without this:

  In [3]: util.getfstype('')
  IOError: [Errno 2] No such file or directory (os error 2)

And there is a code path hitting this:

  File "edenscm/mercurial/util.py", line 1483, in checknlink
    fstype = getfstype(os.path.dirname(testfile))
		# testfile = '.'
	  # os.path.dirname(".") = ""

The old implementation works fine for an empty path:

	In [2]: m.util.getfstype('')
  Out[2]: 'eden'

So let's make the new Rust implementation consistent.

Reviewed By: xavierd

Differential Revision: D20313387

fbshipit-source-id: 258c424a3e8a796d983e20b0d4656e8e3f413706
2020-03-11 17:35:40 -07:00
Jun Wu
61bebcaacc fsinfo: try harder to get fuse fs type
Summary: Similar to D13982877. Try to get names like "fuse.ntfs".

Reviewed By: farnz

Differential Revision: D20313392

fbshipit-source-id: 8363d3d92843e6afb53a0003950be083034bd841
2020-03-11 17:35:39 -07:00
Jun Wu
13374f9d74 fsinfo: drop most type parameters
Summary:
Only keep type parameters at the top-level function.
This reduces the binary size and speeds up rustc.

Reviewed By: xavierd

Differential Revision: D20313388

fbshipit-source-id: 29d77731ff462fee1f1bb9f234601e3430198ae7
2020-03-11 17:35:39 -07:00
Jun Wu
c83006002c fsinfo: return unknown on unsupported platforms
Summary: This makes the code a bit more portable.

Reviewed By: xavierd

Differential Revision: D20313389

fbshipit-source-id: 080538939fa4d2d72e5905f23ad9be987d952748
2020-03-11 17:35:38 -07:00
Jun Wu
9cdc818915 fsinfo: drop "repo" from method names
Summary:
Rename the main method to "fstype". The API has no relation with repo.
So let's rename it.

Reviewed By: xavierd

Differential Revision: D20313386

fbshipit-source-id: 80dd1231ccccfe945150b117b151bce773f0dfeb
2020-03-11 17:35:38 -07:00
Jun Wu
951c8ab082 fsinfo: backport from telemetry
Summary: The fsinfo crate provides the "filesystem type" information.

Reviewed By: xavierd

Differential Revision: D20313391

fbshipit-source-id: f717f5edb32957d59d03090117cfdb8123f03933
2020-03-11 17:35:37 -07:00
Xavier Deguillard
f466037b4b revisionstore: fix memcache test flakiness
Summary:
Since the mocked memcache is shared between the tests, we need to make sure the
keys used by the tests are different, otherwise they are just caching each
others data.

Reviewed By: ikostia

Differential Revision: D20388783

fbshipit-source-id: 0f2f926e0ffe0e52e55291e46142808ce0921288
2020-03-11 15:58:03 -07:00
Jun Wu
97e9b81ba5 indexedlog: remove compiler warnings on Windows
Summary:
Some `use`s are not used on Windows. The code was also formatted using the
latest rustfmt.

Reviewed By: xavierd

Differential Revision: D20379704

fbshipit-source-id: ffadcd68e4e0440dcbd2a4e1ad8532b47a9d83e2
2020-03-11 15:54:19 -07:00
Xavier Deguillard
c98b9cfff9 revisionstore: remove Arc from MetadataStore
Summary: Similarly to the ContentStore, remove the Arc from MetadataStore.

Reviewed By: quark-zju

Differential Revision: D20376838

fbshipit-source-id: 4321600b752c919b6d9fa7bdee6f6cb7ae083b10
2020-03-11 13:39:06 -07:00
Xavier Deguillard
7e704ec7fb revisionstore: remove the Arc from ContentStore
Summary:
The clients should use an Rc/Arc if they need the ability to clone it. This
makes it more obvious and reduces the number of pointer indirection.

Reviewed By: quark-zju

Differential Revision: D20376839

fbshipit-source-id: c56e7e8f89ab17727be621894c329e344a7f3adb
2020-03-11 13:39:05 -07:00
Jun Wu
4960709aa3 dag: do not depend on types
Summary:
The dag crate is designed to work with any kind of binary commit hashes (ex. bonsai,
git or hg). The only use of `types` is to convert from binary to hex. Since dag
already has its own `to_hex` logic in `VertexName`. Let's use that instead.

Reviewed By: sfilipco

Differential Revision: D20378447

fbshipit-source-id: 00ecb551ea927fdb60dd91e5e645064f23139bcd
2020-03-11 10:49:31 -07:00
Jun Wu
009ea22175 indexedlog: retry rename in atomic_write on Windows
Summary:
Recently there are some Windows-related test flakiness in . All of them are
caused by `file.persist(path)` in `atomic_write_plain` failing with
"Access Denied". Since that can be caused by Windows Anti-Virus scans or other
weird stuff, let's workaround around it using automatically retires.

Process Explorer does not provide extra information:

    indexedlog-d0c6135fd7ed9ece.exe	5868	SetRenameInformationFile	C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpcfDsQQ	ACCESS DENIED	ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta

A successful rename looks like:

    indexedlog-d0c6135fd7ed9ece.exe	5868	SetRenameInformationFile	C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpbXEVw0	SUCCESS	ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta

Reviewed By: ikostia

Differential Revision: D20379618

fbshipit-source-id: db3e6be3d785875486f7a517df11cbf58bf65ddd
2020-03-11 10:06:47 -07:00
Xavier Deguillard
5d230aef68 backingstore: use get_file_content to strip metadata
Summary:
Now that the ContentStore can automatically strip the metadata header, no need
for duplicated code in the backingstore.

Reviewed By: fanzeyi

Differential Revision: D20376812

fbshipit-source-id: e863e1cc2fcdc8b9e612a464b305fa25ceb66e13
2020-03-11 09:40:26 -07:00
Xavier Deguillard
40bbe7b4da merge: add a Rust threaded file updater
Summary:
During `hg update`, Mercurial forks multiple processes to write files on disk
concurrently, this is done as fetching blobs from the content store, and
writing them to disk is CPU bound. Usually, threads would be the preferred way
of speeding up such process, but unfortunately, Python has GIL that severely
limit the available concurrency. So, multiple processes were chosen.

Unfortunately, the multi-process solution also brings a lot of other issues,
more recently, we've had cases where the connections to the server and memcache
had to be dropped after the fork. In some other cases, this caused deadlocks.
And the solution is not effective on Windows.

Now that Mercurial is getting more and more Rust, we could instead go back to
the threads solution by using them in Rust, and have Python just push work to
them, this is exactly what this change does.

Things that are left to be done, but I wanted to get a diff out first:
 - no file path audit
 - no file backup
 - no symlink creation
 - probably other things I'm missing

Reviewed By: quark-zju

Differential Revision: D20102888

fbshipit-source-id: d47829fd7818b97710586b9851880f178048e27b
2020-03-11 01:13:54 -07:00
Xavier Deguillard
185bc0f437 revisionstore: add an LfsMultiplexer store
Summary:
With this new store, blobs will be transparently written to either an LFS
store, or a non-LFS one, depending on their size.

Initially, and as long as getpackv2 is supported, we also need to support
parsing lfs pointer data that the server is sending and write these to the lfs
pointer store. This code is very adhoc and does manual parsing of the pointer
data, definitively not great, suggestion for a simple and better solution is
welcome :).

From a migration standpoint, the read-only LFS stores are added to the
ContentStore, this allows blobs written in it to be readable at all time even
when `remotefilelog.lfs` isn't set. The code will effecitvely be dormant for a
while until the option is turned on, if we need to disable it, the dormant code
will still be able to read all the blobs written to disk. This forces us to
deploy a release that contains this code to stable first, before setting
`remotefilelog.lfs`.

Reviewed By: quark-zju

Differential Revision: D19986878

fbshipit-source-id: 260f5a542d52e748c0c703bfa7bb8ffac0e7b388
2020-03-10 18:14:54 -07:00
Jun Wu
5f84fc8222 indexedlog: use dev-logger
Summary: This makes `RUST_LOG` work for indexedlog tests.

Reviewed By: xavierd

Differential Revision: D20286515

fbshipit-source-id: ff4a1476eb01a9067dabe3622fd598f65fe86a18
2020-03-10 14:16:39 -07:00
Jun Wu
7a12c33163 dev-logger: a simple library to enable env_logger for testing
Summary:
The tracing / env_logger integration works for hg as a binary. However I'd also
like to use it in library tests. This crate makes it easier to do so.

Reviewed By: xavierd

Differential Revision: D20286507

fbshipit-source-id: f5bf3288ce950591ddfe64b524ad51ce21ee4099
2020-03-10 14:16:38 -07:00
Jun Wu
cf72dc45f5 indexedlog: add some tracing information
Summary: Those has helped me debugging some issues.

Reviewed By: xavierd

Differential Revision: D20286513

fbshipit-source-id: 012ddb16c2d0efd8f8697a5ecd4564ea31d65630
2020-03-10 14:16:38 -07:00
Jun Wu
e7ed737a64 hgcommands: make env_logger show exit code
Summary: Move the scope of spans so the exit code is shown.

Reviewed By: xavierd

Differential Revision: D20286516

fbshipit-source-id: f39cbf60c86ea19a1bb0a09958748f04ff6a42e8
2020-03-10 14:16:37 -07:00
Jun Wu
bb9023c2cb hgcommands: move env_logger initialization to hgcommands
Summary:
Previously env_logger is only initialized if Python is initialized.
This diff makes env_logger initialized for Rust native commands.

Reviewed By: xavierd

Differential Revision: D20286517

fbshipit-source-id: 18fee96c2b41db1da9648d615d1e18809de90a63
2020-03-10 14:16:37 -07:00
Jun Wu
97d0a976fd tracing: make it write to the log eco-system
Summary:
This means crates like env_logger (which reads $RUST_LOG, and writes to stderr)
can be used for convenient debugging.

Reviewed By: xavierd

Differential Revision: D20286514

fbshipit-source-id: e3b80cc4830ba5cc6dbf7aa1cbb92a4f4f046a54
2020-03-10 14:16:37 -07:00
Jun Wu
796f199130 tracing: save static metadata from tracing to Spans
Summary:
Those metadata include module_path, target, line number, etc, in Rust native
format.  They will be used for the upcoming `log` integration.

Reviewed By: xavierd

Differential Revision: D20286510

fbshipit-source-id: 27019b941bef08c0bb3e505bbdae642282dcb141
2020-03-10 14:16:36 -07:00
Stefan Filip
d8b4ddcecf dag: split lock file acquisition to own function
Summary:
Spliting lock file acquisition from `IdDag::prepare_filesystem_sync` to its own
function.
Useful when looking ahead to split IdDag from IndexedLog.

Reviewed By: quark-zju

Differential Revision: D20316443

fbshipit-source-id: a0fd43439730376920706bb4349ce497f6624335
2020-03-09 10:18:07 -07:00
Stefan Filip
620cdd96f2 dag: add IdDag::iter_segments_with_parent
Summary:
This removes an inline use of the indexedlog indexes.
This is going to be useful when we try to separate IndexedLog specifics from
IdDag functionality.

Reviewed By: quark-zju

Differential Revision: D20316058

fbshipit-source-id: 942a0a71660bb327376c81fd3ac435d002ecca6e
2020-03-09 10:18:07 -07:00
Kuba Zika
6a25dbee81 Simplify error pattern matching
Summary:
Instead of returning `anyhow::Error` wrapping an `ErrorKind` enum
from each Thrift client method, just return an error type specific
to that method. This will make error handling simpler and less
error-prone by removing the need to downcast the returned error.

This diff also removes the `ErrorKind` enums so that we can be sure
that there are no leftover places trying to downcast to them.

(Note: this ignores all push blocking failures!)

Reviewed By: dtolnay

Differential Revision: D20260398

fbshipit-source-id: f0dd96a7b83dd49f6b30948660456539012f82e6
2020-03-06 12:09:38 -08:00
Jun Wu
3103fcf62b indexedlog: reload content after obtaining a lock at open time
Summary:
The old code does "read, lock, write", which is unsound because after "lock"
the data just read can be outdated and needs a reload.

Reviewed By: xavierd

Differential Revision: D20306137

fbshipit-source-id: a1c29d5078b2d47ee95cf00db8c1fcbe3447cccf
2020-03-06 08:12:02 -08:00
Jun Wu
75e4ffc17f indexedlog: change IndexDef.lag_threshold back from entries to bytes
Summary:
I thought the index function could be the bottleneck. However, the Log reading
(xxhash, decoding vlqs) can be much slower for very long entries. Therefore
using bytes as the lag threshold is better. It does leaked the Log
implementation details (how it encodes an entry) to some extend, though.

Reverts D20042045 and D20043116 logically. The lagging calculation is using
the new Index::get_original_meta API, which is easier to verify correctness
(In fact, it seems the old code is wrong - it might skip Index flushes if
sync() is called multiple times without flushing).

This should mitigate an issue where a huge entry (generated by `hg trace`) in
blackbox does not get indexed in time and cause performance regressions.

Reviewed By: DurhamG

Differential Revision: D20286508

fbshipit-source-id: 7cd694b58b95537490047fb1834c16b30d102f18
2020-03-05 13:29:48 -08:00
Jun Wu
efff6f3592 indexedlog: add an API to get the Index meta that is not dirty
Summary: This will be used to more reliably detect index lags.

Reviewed By: DurhamG

Differential Revision: D20286518

fbshipit-source-id: c553b6587363a55603b75df12580588e3100e35f
2020-03-05 13:29:47 -08:00
Jun Wu
66e60bacb9 rotatelog: build indexes for older logs on access
Summary:
This ensures indexes are complete even if index format or definition has been
changed.

Reviewed By: DurhamG

Differential Revision: D20286509

fbshipit-source-id: fcc4ebc616a4501e4b6fd2f1a9826f54f40b99b8
2020-03-05 13:29:47 -08:00
Jun Wu
669c58bd56 blackbox: use RotateLog::iter_dirty()
Summary:
This avoids loading all blackbox logs when `init()` gets called multiple times
(for example, once in Rust and once in Python).

Reviewed By: DurhamG

Differential Revision: D20286511

fbshipit-source-id: ef985e454782b787feac90a6249651a882b6552e
2020-03-05 13:29:47 -08:00
Jun Wu
1c6310b9d6 rotatelog: add iter_dirty() API
Summary: This API has the benefit that it does not trigger loading older logs.

Reviewed By: DurhamG

Differential Revision: D20286512

fbshipit-source-id: 426421691ad1130cdbb2305612d76f18c9f8798c
2020-03-05 13:29:46 -08:00
Jun Wu
64ba669a51 nameset: add some tests for DagSet
Summary:
With the new crate-public interfaces and Debug implementations it's possible to
write tests for DagSet. So let's do it.

Reviewed By: sfilipco

Differential Revision: D20242561

fbshipit-source-id: 180e04d9535f79471c79c4307f6ab6e8e8815067
2020-03-05 11:46:18 -08:00
Xavier Deguillard
34bce8690f revisionstore: silence compiler warning
Summary:
Don't restrict constructing a c_api datapack store to only Unix, we can
construct it on Windows too by assuming that their path will be valid UTF-8.

Reviewed By: quark-zju

Differential Revision: D20250718

fbshipit-source-id: 07234b6a71b50c803cfe3b962fa727f57037c919
2020-03-05 09:35:57 -08:00
Xavier Deguillard
751fc53638 types: add an ancestors method to RepoPath
Summary: This returns the ancestors in the reverser order as the parents method.

Reviewed By: sfilipco

Differential Revision: D20265277

fbshipit-source-id: 83277cee3d8e9070fc56d20d4c1877e6782c22f7
2020-03-05 09:31:32 -08:00
Jun Wu
bb1562604a dag: make some test APIs public in crate
Summary: Those will be reused by nameset::DagSet.

Reviewed By: sfilipco

Differential Revision: D20242563

fbshipit-source-id: 944e9a04aeb15439256ecea64355b67e326e5c89
2020-03-04 17:33:25 -08:00
Jun Wu
b8e1477401 nameset: impl Debug for other sets
Summary:
This is useful for `assert_eq!(format!("{:?}", set), "...")` tests.

It will be eventually exposed to Python as `__repr__`, similar to Python's
smartsets.

Reviewed By: sfilipco

Differential Revision: D20242562

fbshipit-source-id: 5373bb180db7cafebf273ace7cf2cb80fbfb8038
2020-03-04 17:33:25 -08:00
Jun Wu
fa069204e3 nameset: impl Debug for StaticSet
Summary:
In the Python world all smartsets have some kind of "debug" information. Let's
do something similar in Rust.

Related code is updated so the test is more readable.

Reviewed By: sfilipco

Differential Revision: D20242564

fbshipit-source-id: 7439c93d82d5d037c7167818f4e1125c5a1e513e
2020-03-04 17:33:24 -08:00
Jun Wu
0ae5a59e9e indexedlog: fix metadata-only updates for Indexes
Summary:
Previously, `flush()` will skip writing the file if there are only metadata
changes. Fix it by detecting metadata changes.

This can potentially fix an issue that certain blackbox indexes are empty,
lagging and require scanning the whole log again and again. In that case,
the index itself is not changed (the root radix entry is not changed), but
only the metadata tracking how many bytes in Log the index covered
changed.

Reviewed By: sfilipco

Differential Revision: D20264627

fbshipit-source-id: 7ee48454a92b5786b847d8b1d738cc38183f7a32
2020-03-04 15:59:12 -08:00
Xavier Deguillard
314d1978ef clidispatch: silence warning on windows
Summary:
Using `if cfg!` instead of `#[cfg]` allows for the compiler to understand
that the arguments aren't unused, and silence the warnings.

Reviewed By: quark-zju

Differential Revision: D20242280

fbshipit-source-id: 332dfe17b3a80a1096d15c91c9fb6644bd10e0cd
2020-03-04 09:49:15 -08:00
Xavier Deguillard
7d9d38017c configparser: silence compiler warning
Summary:
Compiling it on Windows produced a bunch of warning due to
`hgrc_configset_load_path` not being compiled on it. Fixed it so it no longer
depends on Unix specific imports.

Reviewed By: quark-zju

Differential Revision: D20241102

fbshipit-source-id: 3002f961191fbb9bc51aa9ac1154d6d50bd7fe23
2020-03-04 09:49:14 -08:00
Xavier Deguillard
db76b7d52b procinfo: address compiler warning
Summary:
The `.into_iter()` for this object is being deprecated and won't compile in
the future, fix it now.

Reviewed By: quark-zju

Differential Revision: D20241103

fbshipit-source-id: fdee463ed81cd07a65f3cc4c70a96c88928b3b87
2020-03-04 09:49:14 -08:00
Xavier Deguillard
be7ae642ea commitcloudsubscriber: silence compiler warning
Summary:
While compiling on Windows, this file issues a bunch of warnings, use `if
cfg!` instead of `#[cfg]` to silence them. The behavior is the same, but the
later allows the compiler to recognize that some is not unused.

Reviewed By: quark-zju

Differential Revision: D20241104

fbshipit-source-id: 2cd7f171c7a2f7220cc73bea9be3359260de19b2
2020-03-04 09:49:14 -08:00
Jun Wu
49464342fd indexedlog: try to use symlink for atomic_write on unix
Summary:
The change is in theory not necessary. However it improves the reliability on
OS crashes a bit, and can potentially workaround some bugs in filesystems
(as we saw in production where the atomic-written files are empty and the
system didn't crash).

The idea is, the `symlink` syscall does the file creation and "content" writing
together, while there is no way to create a file and write specific content
in one syscall. Note that the C symlink call uses 0-terminated string, and
the Rust stdlib exports it as accepting `Path`. To be safe, we encode binary
or non-utf8 content using `hex`.

For downgrade safety, the write path does not use symlink by default unless
format.use-symlink-atomic-write is set to true. This makes downgrade possible:
the read path is rolled out first, then we can turn on and off the write path.

The indexedlog Rust unit tests and test-doctor.t are migrated to use the new
symlink code paths.

Reviewed By: DurhamG

Differential Revision: D20153864

fbshipit-source-id: c31bd4287a8d29575180fbcf7227d2b04c4c1252
2020-03-04 07:23:48 -08:00
Jun Wu
def12896db indexedlog: add a utility function to read files crated by atomic_write
Summary:
This makes it possible to implement atomic_write differently (ex. use a
symlink).

Reviewed By: DurhamG

Differential Revision: D20153865

fbshipit-source-id: 07fa78c2f2dac696668f477c75f65cf70950b73f
2020-03-04 07:23:47 -08:00
Jun Wu
7c5e47bab1 indexedlog: rename chunk_size_log to chunk_size_logarithm
Summary:
This makes it clear that `log` is a math concept, not an append-only file like
`Log`.

Reviewed By: DurhamG

Differential Revision: D20149376

fbshipit-source-id: 67d2e9584b15f48759ca9b6dfce4279a5b1365a0
2020-03-03 13:41:28 -08:00
David Tolnay
e988a88be9 rust: Rename futures_preview:: to futures::
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/

This codemod replaces *all* dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`.

This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on *both* 0.1 and 0.3 futures.

Codemod performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \
| xargs sed -i 's,\bfutures_preview::,futures::,'

rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,'
```

Reviewed By: k21

Differential Revision: D20213432

fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6
2020-03-03 11:01:20 -08:00
Genevieve Helsel
528015f9fe allow more hg fastpath cases
Reviewed By: simpkins

Differential Revision: D20143888

fbshipit-source-id: 4b1a73159bde6835626ad1766b2cf9dcd2faf6c4
2020-03-02 07:43:39 -08:00
Jun Wu
c718a5dc19 pathmatcher: add a test about a bug in globset/aho-corasick
Summary:
Also patch aho-corasick to fix the issue.

The issue was introduced by [an optimization path](063ca0d253) added in aho-corasick 0.7 series (used by globset 0.4.3).
aho-corasick 0.6.x (globset 0.4.2) are not affected.

The next aho-corasick release (0.7.9) contains the fix.

See https://github.com/BurntSushi/aho-corasick/issues/53 for more context.

Reported by: yns88

Reviewed By: DurhamG

Differential Revision: D20125697

fbshipit-source-id: 592375b43d7ee494bb3e916a1cb11c18f9ebe425
2020-02-28 22:09:28 -08:00
Jun Wu
10bb5a144e revset: replace some repo.revs with repo.nodes
Summary:
Migrate away from some uses of revision numbers.
Some dead code in discovery.py is removed.

I also fixed some test issues when I run tests locally.

Reviewed By: sfilipco

Differential Revision: D20155399

fbshipit-source-id: bfdcb57f06374f9f27be51b0980652ef50a2c8e0
2020-02-28 17:45:26 -08:00
Jun Wu
0220b4a0c3 nameset: make NameIter Send
Summary: This makes it possible to use NameIter in py_class.

Reviewed By: sfilipco

Differential Revision: D20020529

fbshipit-source-id: b9147b7dccb38d18d8361b420507fcbe97e01351
2020-02-28 16:35:25 -08:00
Jun Wu
782f2017aa dag: add hex prefix lookup
Summary: This will be used by commit hash prefix lookup.

Reviewed By: sfilipco

Differential Revision: D20020523

fbshipit-source-id: f2905ddf63098704b08dad8eb48272c3ffba7e25
2020-02-28 16:35:24 -08:00
Jun Wu
12441f48bf dag: re-export common types at top-level
Summary: Export common types at the top-level of the crate so it's easier to use.

Reviewed By: sfilipco

Differential Revision: D20020526

fbshipit-source-id: e9a0a8bc3cc91f81d0bc74e7530cd4613fc1dd61
2020-02-28 16:35:23 -08:00
Jun Wu
bc9f72ccf3 dag: implement DAG algorithms on NameDag
Summary: Those just delegate to IdDag for the actual calculation.

Reviewed By: sfilipco

Differential Revision: D20020522

fbshipit-source-id: 272828c520097c993ab50dac6ecc94dc370c8e8b
2020-02-28 16:35:23 -08:00
Jun Wu
b88da34fb0 dag: expose NameDag in tests
Summary: This allows tests to check NameDag APIs.

Reviewed By: sfilipco

Differential Revision: D20020525

fbshipit-source-id: 4ee8e4bcbd0731512ba17068e827b8045fc5d522
2020-02-28 16:35:23 -08:00
Jun Wu
194cd25f4f dag: add Arc<IdMap> to NameDag
Summary: This will be used to produce NameSet.

Reviewed By: sfilipco

Differential Revision: D20020519

fbshipit-source-id: abf6d73f2b985b74560d6b5db2800ff25450f02e
2020-02-28 16:35:22 -08:00
Jun Wu
7a343271b9 dag: rename NameDag::parents to NameDag::parent_names
Summary: This matches IdDag::parents (taking a set) and IdDag::parent_ids.

Reviewed By: sfilipco

Differential Revision: D20020524

fbshipit-source-id: 6e90727c355a7400f9a23e0b25e3392bdc032f49
2020-02-28 16:35:22 -08:00
Jun Wu
e3b28a683c nameset: add fast paths for DagSet
Summary: DagSet's SpanSet has fast paths for set operations. Use them.

Reviewed By: sfilipco

Differential Revision: D19912104

fbshipit-source-id: 24b55aa14d03be2f1be59c923e0b8e79d6bcbe6d
2020-02-28 16:35:22 -08:00
Jun Wu
587b06efee nameset: AllSet
Summary: This is similar to hg's fullreposet. It'll be useful as a dummy "subset".

Reviewed By: sfilipco

Differential Revision: D19912108

fbshipit-source-id: 33a95bcb3cf5931a431a1201d1a1f3c627cec7a1
2020-02-28 16:35:21 -08:00
Jun Wu
d41c55a13b nameset: SortedSet
Summary: SortedSet is a wrapper to other sets that marks it as topologically sorted.

Reviewed By: sfilipco

Differential Revision: D19912111

fbshipit-source-id: 2637e8fd29b97f6db0c5bae3f0decd7ac382eeb1
2020-02-28 16:35:21 -08:00
Jun Wu
51bea7aff7 nameset: LazySet
Summary: Similar to Mercurial's smartset.generatorset.

Reviewed By: sfilipco

Differential Revision: D19912110

fbshipit-source-id: 7d940b8578ec7090282e2addb1fde871cddb2b25
2020-02-28 16:35:20 -08:00
Jun Wu
5e451d07b1 nameset: DagSet
Summary:
Wraps SpanSet + IdMap so it only exposes commit names without ids.
There is no equivalent smartset in Mercurial.

Reviewed By: sfilipco

Differential Revision: D19912112

fbshipit-source-id: 0d257de11527dfa8836065ac94f652730a97a468
2020-02-28 16:35:20 -08:00
Jun Wu
e7e7a5b356 nameset: StaticSet
Summary: Similar to Mercurial's smartset.baseset. All names are statically known.

Reviewed By: sfilipco

Differential Revision: D19912105

fbshipit-source-id: e4fcf2d59291adb3ca01b3b90f1ac32c65ad7eaa
2020-02-28 16:35:20 -08:00
Jun Wu
349d1bc33e nameset: IntersectionSet
Summary: Similar to Mercurial's smartset.filterset.

Reviewed By: sfilipco

Differential Revision: D19912113

fbshipit-source-id: 7cf2101b2eb7ba34b542199293cdbfd3973ef72f
2020-02-28 16:35:19 -08:00
Jun Wu
c0a1a3ab22 nameset: DifferenceSet
Summary: Similar to Mercurial's smartset.filterset.

Reviewed By: sfilipco

Differential Revision: D19912107

fbshipit-source-id: a3187c94f8e0c64f6d92e924ba46e83ce74c3e19
2020-02-28 16:35:19 -08:00
Jun Wu
b6cea95ea5 dag: use Bytes to avoid some VertexName copies
Summary:
This is an example about how to use the new Bytes type. The performance change
is not obviously visible in benchmarks since the bottleneck is not at the bytes
copying.

Reviewed By: DurhamG

Differential Revision: D19818720

fbshipit-source-id: a431ae206cfa4fa08b2e162a48b3d7cbcd900f7f
2020-02-28 09:23:59 -08:00
Jun Wu
76ab726056 dag: switch from bytes to minibytes
Summary: The APIs are compatible so the switch is straightforward.

Reviewed By: DurhamG

Differential Revision: D19818713

fbshipit-source-id: 504e9149567c90eb661804e0dad20580a401aa76
2020-02-28 09:23:59 -08:00
Jun Wu
9e3920ca1c dag: fix benchmarks
Summary: D19559127 forgot those files.

Reviewed By: DurhamG

Differential Revision: D19818715

fbshipit-source-id: 92321492eae89ed9f748800b3bfcc306a54aab20
2020-02-28 09:23:59 -08:00
Jun Wu
c417232b1b mutationstore: update lag_threshold
Summary:
D20042045 changes the meaning of "lag_threshold". Update the value in mutation
store accordingly.

Reviewed By: DurhamG

Differential Revision: D20043116

fbshipit-source-id: 154e6dc2aa88ab0a9a9b21929ae5fa6163dcd403
2020-02-28 09:23:59 -08:00
Jun Wu
1962fd5f5b indexedlog: update lagging indexes at open time
Summary:
Previously indexes are only updated at `sync()` time. This diff makes it so
`open()` can also update lagging indexes. This should make index migration
(ex. D19851355) smoother - indexes are built in time and users suffer less from
the absent of indexes.

Reviewed By: DurhamG

Differential Revision: D20042046

fbshipit-source-id: 20412661a0ca4f5f67b671137c47b6373a42981d
2020-02-28 09:23:58 -08:00
Jun Wu
6da3bdadd2 indexedlog: extract logic writing indexes to disk to a method
Summary: The logic is currently only used by `sync()`. I'd like to reuse it at `open()`.

Reviewed By: DurhamG

Differential Revision: D20042044

fbshipit-source-id: 5c9734ff68bdcf8f8c8710c6a821b18d3afeaca0
2020-02-28 09:23:58 -08:00
Jun Wu
afb24f8a8a indexedlog: change IndexDef.lag_threshold from bytes to entries
Summary:
This is more friendly for indexedlog users - deciding lag_threshold by number
of entries is easier than by bytes.

Initially, I thought checking `bytes` is cheaper and checking `entries` is more
expensive. However, practically we will have to build indexes for `entires`
anyway. So we do know the number of entries lagging behind.

Reviewed By: DurhamG

Differential Revision: D20042045

fbshipit-source-id: 73042e406bd8b262d5ef9875e45a3fd5f29f78cf
2020-02-28 09:23:58 -08:00
Jun Wu
55363a78a7 indexedlog: add API to convert &[u8] to zero-copy Bytes
Summary:
This can be useful for users of indexedlog when they want `Bytes` (to get rid
of the lifetime parameter).

This might be useful for storage layer that wants to take the ownership of the
returned bytes.

Reviewed By: xavierd

Differential Revision: D19818714

fbshipit-source-id: cb2d4e7deff921915e07454fee15cb94a3d5c00d
2020-02-28 09:23:57 -08:00
Jun Wu
556850e715 indexedlog: remove unused mmap utility functions
Summary: Those utilities are no longer necessary since the new code uses Bytes.

Reviewed By: xavierd

Differential Revision: D19818717

fbshipit-source-id: 0b43af0f1eae1a4288e84d4170db058b27f80334
2020-02-28 09:23:57 -08:00
Jun Wu
aaf59c569d indexedlog: replace Mmap with Bytes in Log
Summary: This simplifies the code a bit and makes it cheaper to clone the Log.

Reviewed By: xavierd

Differential Revision: D19818716

fbshipit-source-id: bbf07b8b36009d53b63d8066ec422fc3c3796840
2020-02-28 09:23:57 -08:00
Jun Wu
90ee3cb05a indexedlog: remove ChecksumTable
Summary: It's no longer used since Index now has inlined its checksum logic.

Reviewed By: ikostia

Differential Revision: D19850744

fbshipit-source-id: eb134e4c1613573a2d238710b44ad8119c80a5ee
2020-02-28 09:23:56 -08:00
Jun Wu
a1601bfdd9 indexedlog: bump index filename
Summary:
Change index filename and metadata name. This makes sure the new format and old
format are separate so upgrading or downgrading won't have issues.

Reviewed By: DurhamG

Differential Revision: D19851355

fbshipit-source-id: 25dee018073a90040f5818b32b753a3f589c10e0
2020-02-28 09:23:56 -08:00
Jun Wu
6f4bf325d5 indexedlog: write Checksum inline with Log
Summary:
Enhance the index format: The Root entry can be followed by an optional
Checksum entry which replaces the need of ChecksumTable.

The format is backwards compatible since the old format will be just
treated as "there is no ChecksumTable", and the ChecksumTable will be built on
the next "flush".

This change is non-trivial. But the tests are pretty strong - the bitflip test
alone covered a lot of issues, and the dump of Index content helps a lot too.

For the index itself without ".sum", checksum, this change is bi-directional
compatible:
1. New code reading old file will just think the old file does not have the
   checksum entry, similar to new code having checksum disabled.
2. Old code will think the root+checksum slice is the "root" entry. Parsing
   the root entry is fine since it does not complain about unknown data at the
   end.

However, this change dropped the logic updating ".sum" files. That part is an
issue blocking old clients from reading new data.

Reviewed By: DurhamG

Differential Revision: D19850741

fbshipit-source-id: 551a45cd5422f1fb4c5b08e3b207a2ffe3d93dea
2020-02-28 09:23:55 -08:00
Jun Wu
b9e3046a8d indexedlog: add Checksum entry to Index
Summary:
To solve the soundness issue of ChecksumTable raised by the last diff.
I plan to move Checksum logic to Index. This has multiple benefits:
- Solve the soundness issue of ChecksumTable.
- Indexedlog no longer writes the ".sum" files. `atomic_write` can be quite
  slow (tens of milliseconds) on Windows. So this should help perf - with
  many indexes, it can save hundreds of milliseconds on Windows per
  indexedlog sync.

This diff adds the definition and serialization of the new Checksum entry.
The index format is not updated yet.

Reviewed By: markbt

Differential Revision: D19850742

fbshipit-source-id: df6e6ed12a12ef0d2a782dc9d6b4dc5dec3f4b46
2020-02-28 09:23:55 -08:00
Jun Wu
0f09413ed4 indexedlog: add a broken test showing checksum_table is racy
Summary:
With the last change, mmap cost is reduced, but ChecksumTable is unsound in a
corner case: the buffer to check is shorter than what ChecksumTable covers:

    checksum:  |----chunk----|----chunk----|----chunk--|
    buf:       |-------------------------------|       |
                                               ^       ^
                                        logic len     physical len

The checksum table will be unable to verify the last chunk, since it does not
have enough data in buf.

The issues is exposed by stress testing the multithread sync tests. It's not
always easy to reproduce, though.

Reviewed By: markbt

Differential Revision: D19850745

fbshipit-source-id: a1a96080163b7b9b56dcd6c1673d5d8d10e18a2b
2020-02-28 09:23:55 -08:00
Jun Wu
1e10527482 indexedlog: share Bytes between Index and ChecksumTable
Summary: This avoids some extra mmap syscalls by ChecksumTable.

Reviewed By: xavierd

Differential Revision: D19818721

fbshipit-source-id: dace55193f2b4b0f35e3868781faa2d2998d3b58
2020-02-28 09:23:54 -08:00
Jun Wu
1ece621c4d indexedlog: replace Mmap with Bytes in Index
Summary:
This simplifies the code a bit (no special cases about 0-sized mmap buffers)
and makes it cheaper to clone the index buffer (just an Arc::clone, without
another mmap syscall).

Reviewed By: xavierd

Differential Revision: D19818718

fbshipit-source-id: e96d42af74c7f0bb11703c5da31cdfbd5d76c372
2020-02-28 09:23:54 -08:00
Jun Wu
918672b106 tracing-collector: support owned strings in TreeSpans
Summary:
TreeSpans used to use `&str`, which adds a lifetime to the struct, making it
harder to be used in the Python land. Use a type parameter so TreeSpans<String>
can be used.

Reviewed By: DurhamG

Differential Revision: D19797708

fbshipit-source-id: c66429abfaf16d876151ca6f29da976bed91485d
2020-02-28 09:16:14 -08:00
Jun Wu
4cd7df6a01 tracing-collector: rename structs
Summary:
TreeSpan -> RawTreeSpan; TreeSpanWithMeta -> TreeSpanRef.

I'm going to add a non-reference version of TreeSpanRef.

Differential Revision: D19797701

fbshipit-source-id: 42b04c23d4d0ddbe821b94fa2ccb133ce9eafa05
2020-02-28 09:16:14 -08:00
Jun Wu
957617c8b8 tracing-collector: support filtering in TreeSpans
Summary:
The filtering interface allows callsite to select what they want. It's similar
to manifest walk with files or directory matchers in source control.

Reviewed By: DurhamG

Differential Revision: D19784467

fbshipit-source-id: 5cf6e4016d6fa1c90f8aeccc50809baccd4af5ab
2020-02-28 09:16:13 -08:00
Jun Wu
366e701239 tracing-collector: support Events in TreeSpans
Summary: The idea is that instants (events) can be a drop-in replacement for `ui.log`.

Reviewed By: DurhamG

Differential Revision: D19782897

fbshipit-source-id: 795bbba23d921e460f723f19ef529b203aea366a
2020-02-28 09:16:13 -08:00
Jun Wu
d205592d42 tracing-collector: extract logic finding parent span to a function
Summary: This function will be reused by the next diff.

Reviewed By: DurhamG

Differential Revision: D19782895

fbshipit-source-id: 1e636eabee9b0dffd287a1e6784a24ab2259f51f
2020-02-28 09:16:13 -08:00
Jun Wu
8b5fdc01fc tracing-collector: put treespans into a struct
Summary: This allows us to define methods on the treespans, such as filtering APIs.

Reviewed By: DurhamG

Differential Revision: D19782896

fbshipit-source-id: 2e7bd8344c0196e382728c26a8233abf944bbf29
2020-02-28 09:16:12 -08:00
David Tolnay
37a8401761 rust/thrift: Un-rename futures-preview dependency
Summary: The Thrift generated code depends only on futures 0.3, not 0.1. Thus it isn't necessary to depend on renamed:futures-preview and we can depend on futures-preview directly, which is exposed to Rust code as `futures::`.

Reviewed By: jsgf

Differential Revision: D20145921

fbshipit-source-id: 5cae94ec6747a374c2bf05f124ab237c798de005
2020-02-27 22:27:58 -08:00
Xavier Deguillard
6fac9ebad0 revisionstore: add a get_stripped method to ContentStore
Summary:
This new method returns the content of a blob without the copy-from metadata
header.

Reviewed By: DurhamG

Differential Revision: D20102889

fbshipit-source-id: e96f636b7d30460b59707a2cb700d667e616116a
2020-02-27 12:29:42 -08:00
Jun Wu
bce29c9562 nameset: UnionSet
Summary: Similar to Mercurial's smartset.addset.

Reviewed By: sfilipco

Differential Revision: D19912106

fbshipit-source-id: 0d0c8d0b71d2757259d26295eb4a564fea807dea
2020-02-27 07:34:57 -08:00