487 Commits

Author SHA1 Message Date
Ellis Hoag
1d0d626a36 Pass config object down to repack
Summary:
Pass `configparser::config::ConfigSet` to `repack` in
`revisionstore/src/repack.rs` so that we can use various config values in `filter_incrementalpacks`.

* `repack.maxdatapacksize`, `repack.maxhistpacksize`
  * The overall max pack size
* `repack.sizelimit`
  * The size limit for any individual pack
* `repack.maxpacks`
  * The maximum number of packs we want to have after repack (overrides sizelimit)
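One way these limits might interact can be sketched as follows. This is only an illustration under stated assumptions, not the actual `filter_incrementalpacks`: the `RepackLimits` struct, the `select_packs` function, and the exact precedence rules (smallest packs first, `maxpacks` overriding `sizelimit` but not the overall size cap) are hypothetical.

```rust
/// Hypothetical container for the config values listed above.
#[derive(Debug, Clone, Copy)]
pub struct RepackLimits {
    /// Stand-in for `repack.maxdatapacksize` / `repack.maxhistpacksize`.
    pub max_total_size: u64,
    /// Stand-in for `repack.sizelimit`.
    pub size_limit: u64,
    /// Stand-in for `repack.maxpacks`.
    pub max_packs: usize,
}

/// Returns the indices of the packs to combine, smallest first.
/// Assumed policy, not taken from the real code.
pub fn select_packs(sizes: &[u64], limits: RepackLimits) -> Vec<usize> {
    // Visit packs from smallest to largest.
    let mut order: Vec<usize> = (0..sizes.len()).collect();
    order.sort_by_key(|&i| sizes[i]);

    let mut selected = Vec::new();
    let mut total: u64 = 0;
    for &i in &order {
        let size = sizes[i];
        // Never let the combined pack exceed the overall cap.
        if total.saturating_add(size) > limits.max_total_size {
            break;
        }
        // Packs left on disk if we stopped now: the untouched ones plus
        // the single combined pack (if anything was selected).
        let remaining = sizes.len() - selected.len() + usize::from(!selected.is_empty());
        let too_many = remaining > limits.max_packs;
        // `max_packs` overrides the per-pack size limit.
        if size <= limits.size_limit || too_many {
            selected.push(i);
            total += size;
        } else {
            break;
        }
    }
    selected
}
```

With sizes `[10, 500, 20, 30]` and a per-pack limit of 100, the sketch picks the three small packs; lowering `max_packs` to 1 also pulls in the 500-byte pack as long as the overall cap allows it.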

Reviewed By: xavierd

Differential Revision: D21484836

fbshipit-source-id: 0407d50dfd69f23694fb736e729819b7285f480f
2020-05-11 16:41:30 -07:00
Xavier Deguillard
1cd0bba3fa revisionstore: enable use of proxies for LFS
Summary:
If http_proxy.no is set, we should respect it so that we avoid sending traffic
through the proxy whenever required.

Reviewed By: wez

Differential Revision: D21383138

fbshipit-source-id: 4c8286aaaf51cbe19402bcf8e4ed03e0d167228b
2020-05-11 10:36:11 -07:00
Xavier Deguillard
2001c3fd69 revisionstore: add translate_lfs_missing to remote store get
Summary:
When Qing implemented all the get methods, the translate_lfs_missing function
didn't exist, and I forgot to add it in the right places when landing the
diff that added it. Fix this.

Reviewed By: sfilipco

Differential Revision: D21418043

fbshipit-source-id: baf67b0fe60ed20aeb2c1acd50a209d04dc91c5e
2020-05-11 10:34:01 -07:00
Jun Wu
85a60dd9e4 renderdag: provide a method to render MemNameDag directly to a string
Summary: This would be handy to visualize a MemNameDag.

Reviewed By: sfilipco

Differential Revision: D21486522

fbshipit-source-id: c8d7147dc53a1a7c1b8b09ce055493c69cceba2f
2020-05-11 09:50:00 -07:00
Jun Wu
4352be72d3 renderdag: use MemNameDag to simplify tests
Summary:
Use MemNameDag::from_ascii to simplify the tests. This removes the need of:
- using tempdir
- converting between Id and VertexName manually via an IdMap
- depending on drawdag directly

Reviewed By: sfilipco

Differential Revision: D21486519

fbshipit-source-id: f04061d8892f043de40e7e321273acc51e15308a
2020-05-11 09:50:00 -07:00
Jun Wu
60684eb2c5 dag: make ASCII -> MemNameDag a public API
Summary:
It seems handy to construct a Dag just from ASCII. Therefore move it to a
public interface.

Reviewed By: sfilipco

Differential Revision: D21486525

fbshipit-source-id: de7f4b8dfcbcc486798928d4334c655431373276
2020-05-11 09:49:59 -07:00
Jun Wu
a6b7e965f3 dag: remove a TODO comment
Summary: It was done as NameSet.

Reviewed By: sfilipco

Differential Revision: D21479022

fbshipit-source-id: 1c32cabb27d72a6438409ede226104a9ebac6a1d
2020-05-11 09:49:59 -07:00
Jun Wu
4eb9251172 dag: move sort and parent_names to NameDagAlgorithm
Summary:
They are part of the read-only algorithms that are not specific to a certain
type of NameDag.

Reviewed By: sfilipco

Differential Revision: D21479017

fbshipit-source-id: 3fa58071ac43246d3cd45d84384ee93c7385f414
2020-05-11 09:49:59 -07:00
Jun Wu
282e034d30 dag: add MemNameDag
Summary:
Adds an in-memory NameDag so we can construct the DAG and use its algorithms by
just providing parents function and heads.
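The idea of building a DAG from nothing but a parents function and a list of heads can be sketched as below. This is a simplified illustration, not the `MemNameDag` implementation: the `MemDagSketch` type, its methods, and the use of plain strings as vertex names are assumptions, and the input is assumed to be acyclic.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory DAG. Ids are assigned so that parents always
/// get smaller ids than their children.
pub struct MemDagSketch {
    names: Vec<String>,          // id -> name
    ids: HashMap<String, usize>, // name -> id
    parents: Vec<Vec<usize>>,    // id -> parent ids
}

impl MemDagSketch {
    pub fn new() -> Self {
        MemDagSketch { names: Vec::new(), ids: HashMap::new(), parents: Vec::new() }
    }

    /// Inserts `heads` and all of their ancestors, as reported by `parents_of`.
    /// Assumes the graph described by `parents_of` has no cycles.
    pub fn add_heads<F>(&mut self, parents_of: F, heads: &[&str])
    where
        F: Fn(&str) -> Vec<String>,
    {
        for head in heads {
            // Iterative DFS; the flag records whether parents were already pushed.
            let mut stack = vec![(head.to_string(), false)];
            while let Some((name, expanded)) = stack.pop() {
                if self.ids.contains_key(&name) {
                    continue;
                }
                if expanded {
                    // All parents have ids by now, so this vertex can get one too.
                    let parent_ids: Vec<usize> =
                        parents_of(&name).iter().map(|p| self.ids[p]).collect();
                    let id = self.names.len();
                    self.ids.insert(name.clone(), id);
                    self.names.push(name);
                    self.parents.push(parent_ids);
                } else {
                    stack.push((name.clone(), true));
                    for p in parents_of(&name) {
                        if !self.ids.contains_key(&p) {
                            stack.push((p, false));
                        }
                    }
                }
            }
        }
    }

    /// Returns `name` and all of its ancestors, parents before children.
    pub fn ancestors(&self, name: &str) -> Vec<String> {
        let mut seen = vec![false; self.names.len()];
        let mut stack: Vec<usize> = self.ids.get(name).copied().into_iter().collect();
        while let Some(id) = stack.pop() {
            if seen[id] {
                continue;
            }
            seen[id] = true;
            stack.extend(self.parents[id].iter().copied());
        }
        (0..self.names.len())
            .filter(|&i| seen[i])
            .map(|i| self.names[i].clone())
            .collect()
    }
}
```

Because ids follow a topological order, read-only queries such as `ancestors` can work on integers alone and translate back to names at the end.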

Reviewed By: sfilipco

Differential Revision: D21479021

fbshipit-source-id: e12d53a97afec77b2307d5efbb280bd506dee0ba
2020-05-11 09:49:58 -07:00
Jun Wu
5cbb99f4eb dag: add MemIdMap
Summary: Adds an in-memory IdMap to be used in an in-memory NameDag.

Reviewed By: sfilipco

Differential Revision: D21479018

fbshipit-source-id: bc702762b059e8659c6ab322f3c39f032e95d5b6
2020-05-11 09:49:58 -07:00
Jun Wu
682e8e96a7 dag: use IdMap traits in NameDag and NameSet
Summary:
This allows them to switch to a different IdMap implementation relatively
easily.

Reviewed By: sfilipco

Differential Revision: D21479023

fbshipit-source-id: 8ecb99cafe2093ec7d14b848ffa08581c5300414
2020-05-11 09:49:57 -07:00
Jun Wu
759f8b35c5 dag: move some IdMap operations to traits
Summary: This will allow different IdMap implementations.

Reviewed By: sfilipco

Differential Revision: D21479016

fbshipit-source-id: 852501896fddcb82624338acd9dceee41150e302
2020-05-11 09:49:57 -07:00
Jun Wu
30163eeb58 dag: update snapshot_map on change
Summary:
`NameDag::add_heads` API changes the internal `dag` state without updating
`snapshot_map`. That will cause queries relying on `snapshot_map` to fail.
Update it so that `snapshot_map` gets updated by `add_heads`.

Reviewed By: sfilipco

Differential Revision: D21479019

fbshipit-source-id: 70528aa4a488cef3dc71bf21dd89e45cfe763794
2020-05-11 09:49:57 -07:00
Jun Wu
f014f86b7a dag: move NameDag algorithms to a trait
Summary:
This makes it easier to add an "in-memory-only" NameDag with all the algorithms
implemented.

Reviewed By: sfilipco

Differential Revision: D21479020

fbshipit-source-id: c1a73e95f3291c273c800650f70db2a7eb0966d7
2020-05-11 09:49:56 -07:00
Xavier Deguillard
4e7303efd9 lfs: only upload when LFS blobs are present
Summary: If no LFS blobs need uploading, then don't try to connect to the LFS server in the first place.

Reviewed By: DurhamG

Differential Revision: D21478243

fbshipit-source-id: 81fa960d899b14f47aadf2fc90485747889041e1
2020-05-08 13:24:21 -07:00
Meyer Jacobs
d49ac73f4c datastore: remove HgIdDataStore ::get_delta and ::get_delta_chain
Summary:
Remove HgIdDataStore::get_delta and all implementations. Remove HgIdDataStore::get_delta_chain from trait, remove all unnecessary implementations, remove all implementations from public Rust API. Leave Python API and introduce "delta-wrapping".

MutableDataPack::get_delta_chain must remain in some form, as it is necessary to implement get using a sequence of Deltas. It has been moved to a private inherent impl.

DataPack::get_delta_chain must remain in some form for the same reasons, and in fact both implementations can probably be merged, but it is also used in repack.rs for the free function repack_datapack. There are a few ways to address this without making DataPack::get_delta_chain part of the public API. I've currently chosen to make the method pub(crate), i.e. visible only within the revisionstore crate. Alternatively, we could move the repack_datapack function to a method on DataPack, or use a trait in a private module, or some other technique to restrict visibility to only where necessary.

UnionDataStore::get has been modified to call get on its sub-stores and return the first which matches the given key.

MultiplexDeltaStore has been modified to implement get similarly to UnionDataStore.
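The "first sub-store that has the key wins" behavior described for the union and multiplex stores might look roughly like this. It is a minimal sketch: the `DataStoreSketch` trait, `MapStore`, and `UnionStoreSketch` are hypothetical stand-ins, and error handling is dropped by returning a plain `Option` where the real stores presumably return a fallible result.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified store interface.
pub trait DataStoreSketch {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

/// A trivial store backed by a map, used only for illustration.
pub struct MapStore(pub HashMap<String, Vec<u8>>);

impl DataStoreSketch for MapStore {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.0.get(key).cloned()
    }
}

/// Queries its sub-stores in order and returns the first hit.
pub struct UnionStoreSketch {
    stores: Vec<Box<dyn DataStoreSketch>>,
}

impl UnionStoreSketch {
    pub fn new(stores: Vec<Box<dyn DataStoreSketch>>) -> Self {
        UnionStoreSketch { stores }
    }
}

impl DataStoreSketch for UnionStoreSketch {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        // `find_map` stops at the first sub-store that returns `Some`.
        self.stores.iter().find_map(|store| store.get(key))
    }
}
```

The order of the sub-stores matters in this sketch: if two stores hold different data for the same key, the earlier one shadows the later one.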

Reviewed By: xavierd

Differential Revision: D21356420

fbshipit-source-id: d04e18a0781374a138395d1c21c3687897223d15
2020-05-07 11:04:01 -07:00
Mark Thomas
052e7c3877 check-code: convert to Python 3
Summary:
Update `contrib/check-code.py` to Python 3.

Mostly it was already compatible, however stricter regular expression parsing
revealed a case where one of our tests wasn't working, and as a result lots of
instances of `open(file).read()` existed that this test should have caught.

I have fixed up most of the instances in the code, although there are many
in the test suite that I have ignored for now.

Reviewed By: quark-zju

Differential Revision: D21427212

fbshipit-source-id: 7461a7c391e0ade947f779a2b476ca937fd24a8d
2020-05-07 09:07:50 -07:00
Durham Goode
939ff6c956 configs: move repo names to an enum
Summary:
A number of repo names are used quite frequently. Let's use an enum to
prevent typos and make things cleaner.

Reviewed By: quark-zju

Differential Revision: D21365036

fbshipit-source-id: 1d3d681443df181e9076f5ee87029ae61124a486
2020-05-06 09:03:17 -07:00
Mohan Zhang
2ef3e20e4d Run auto cargo locally
Summary: Needed since the third-party dependency change in D21341319

Reviewed By: jsgf, wqfish

Differential Revision: D21417890

fbshipit-source-id: 3cc6bafa23512c7ae489513216bcafa46e7a744f
2020-05-05 20:59:02 -07:00
Zeyi (Rice) Fan
952069397c fix get blob local
Summary: This bug crept in while iterating on the original diff. It should only return empty when the blob does not exist locally.

Reviewed By: xavierd

Differential Revision: D21417659

fbshipit-source-id: 676e22313ab4a024af5341d8c99797fc062bd293
2020-05-05 20:21:21 -07:00
Durham Goode
97d84e3b5d configs: move hgrc.dynamic to always be in the shared repo
Summary:
Instead of trying to maintain two hgrc.dynamic's for shared repositories,
let's just always use the one in the shared repo. In the long term we may be
able to get rid of the working-copy-specific hgrc entirely.

This does remove the ability to dynamically configure individual working copies.
That could be useful in cases where we have both eden and non-eden pointed at
the same repository, but I don't think we rely on this at the moment.

Reviewed By: quark-zju

Differential Revision: D21333564

fbshipit-source-id: c1fb86af183ec6dc5d973cf45d71419bda5514fb
2020-05-05 18:19:10 -07:00
Durham Goode
6e7f85b949 config: load .hg/hgrc.dynamic
Summary:
Adds .hg/hgrc.dynamic to the default load path, before .hg/hgrc though,
so it can be overridden.
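The effect of load order on overrides can be illustrated with a small layered lookup. This is a sketch only: `load_layers` and its tuple-based input are hypothetical and do not reflect the configparser API; the point is simply that a file loaded later replaces values from a file loaded earlier.

```rust
use std::collections::HashMap;

/// Merges config layers in load order. Each layer is a source name plus
/// its `(key, value)` pairs. Returns key -> (value, source that set it).
/// Hypothetical helper, not part of the real config loader.
pub fn load_layers(layers: &[(&str, Vec<(&str, &str)>)]) -> HashMap<String, (String, String)> {
    let mut merged = HashMap::new();
    for (source, entries) in layers {
        for (key, value) in entries {
            // A later layer overwrites whatever an earlier layer set.
            merged.insert(key.to_string(), (value.to_string(), source.to_string()));
        }
    }
    merged
}
```

Loading `.hg/hgrc.dynamic` first and `.hg/hgrc` second means a value set in both ends up with the `.hg/hgrc` value, while keys that only the dynamic file sets are kept.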

Reviewed By: quark-zju

Differential Revision: D21310921

fbshipit-source-id: 288a2a2ba671943a9f8532489c29e819f9d891e1
2020-05-05 18:19:08 -07:00
Durham Goode
dc90e2ca04 configs: add domain, platform, and improve dynamicconfigs
Summary:
Adds dynamic config conditionals for domain, platform, and adds a few
useful helper functions.

Reviewed By: quark-zju

Differential Revision: D21310920

fbshipit-source-id: 58f35e52d6d7a4edae2b3aefff533ef2c021aa57
2020-05-05 18:19:08 -07:00
Durham Goode
8744b81151 git: update git dependency to 0.13.5 to match internal version
Summary:
Our internal git dependency got upgraded, so we need to upgrade our
Cargo.toml version.  Unfortunately this doesn't seem to have any test coverage?

Reviewed By: singhsrb

Differential Revision: D21410241

fbshipit-source-id: 64fe7f39a9c93aa5d97ce095ee1641c1cc6ed365
2020-05-05 15:35:12 -07:00
Zeyi (Rice) Fan
3baa8cc9b4 check if the blob fetching is present locally
Summary:
Talked with xavierd last week and we can use LocalStore's `get_missing` to determine if a blob is present locally. In this way we can prevent the backingstore crate from accidentally asking EdenAPI for a blob, which gives us better control at the EdenFS level.

With this change, we can use this function at the time a blob import request is created, with confidence that this should be a short, cheap call.

This diff should not change any behavior or performance.

Reviewed By: xavierd

Differential Revision: D21391959

fbshipit-source-id: fd31687da1e048262cb4eae2974cab6d8915a76d
2020-05-05 11:14:40 -07:00
Zeyi (Rice) Fan
da28f5a5b1 revisionstore: create directory with group share permission in correct places
Summary: When we create directories in certain places, we want them to be shared between different users on the same machine. This diff uses the previously added `create_shared_dir` function to create these directories.

Reviewed By: xavierd

Differential Revision: D21322776

fbshipit-source-id: 5af01d0fc79c8d2bc5f946c105d74935ff92daf2
2020-05-04 19:21:33 -07:00
Jun Wu
73ff6559e6 zstore: add simple caching
Summary: Add simple caching so zstore can avoid some zstd calculation.

Reviewed By: DurhamG

Differential Revision: D21213076

fbshipit-source-id: 5e3152949cf4e6d6193c3ef3401f24e2efac5620
2020-05-01 14:24:52 -07:00
Durham Goode
3ac2be361a configs: move fbrules to a Facebook only part of the crate
Summary:
We'll be adding a bunch of Facebook specific configuration and values
here. Let's move it to someplace not open source.

Reviewed By: quark-zju

Differential Revision: D21241038

fbshipit-source-id: 2ac9cdce40b1b46f15f171d9d1f6b6692dcd29bf
2020-05-01 13:17:21 -07:00
Durham Goode
b1a2785a19 configparser: add ensure_location_supersets function
Summary:
Implements an ensure_location_supersets function whose goal is to
verify that a given config location specifies the exact same configs as a given
set of other locations. Any inconsistencies are removed from the config and
reported to the caller.

This will be used to ensure our dynamic configs match our existing rc file
configs exactly, before we delete the file configs.
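The check described here could be approximated as below. This is a rough sketch with assumptions: `check_superset` and `Mismatch` are hypothetical names, configs are flattened into plain key/value maps, and extra keys that appear only in the candidate location are tolerated, which may differ from what the real function does.

```rust
use std::collections::BTreeMap;

/// One inconsistency reported back to the caller (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Mismatch {
    pub key: String,
    pub expected: String,
    pub actual: Option<String>,
}

/// Verifies that `candidate` carries every value found in `reference`.
/// Keys that are missing or differ are removed from `candidate` and reported.
pub fn check_superset(
    candidate: &mut BTreeMap<String, String>,
    reference: &BTreeMap<String, String>,
) -> Vec<Mismatch> {
    let mut mismatches = Vec::new();
    for (key, expected) in reference {
        let actual = candidate.get(key).cloned();
        if actual.as_ref() != Some(expected) {
            // Drop the inconsistent value so it cannot take effect.
            candidate.remove(key);
            mismatches.push(Mismatch {
                key: key.clone(),
                expected: expected.clone(),
                actual,
            });
        }
    }
    mismatches
}
```

In this sketch, `candidate` would hold the dynamic config and `reference` the values from the existing rc files; an empty result means the dynamic config reproduces the rc files for every key they set.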

Reviewed By: quark-zju

Differential Revision: D21240837

fbshipit-source-id: e2c8ec054a3696d2cf02e65c212ad886c5117253
2020-05-01 13:17:21 -07:00
Jason White
d5b2fb798e Fix outdated Cargo.toml files
Summary: `cargo autocargo` should normally produce no changes on `master`. The features of the `log` crate were updated in D21303891 without re-running autocargo. This fixes it.

Reviewed By: dtolnay

Differential Revision: D21349799

fbshipit-source-id: ce487bc5989e179673297350249593103b4d34dd
2020-05-01 10:29:33 -07:00
Zeyi (Rice) Fan
8830ed55df util: not try to create the directory when it already exists
Summary: Fix permission issues we are seeing with the latest Mercurial release.

Reviewed By: xavierd

Differential Revision: D21294499

fbshipit-source-id: bcfb13dd005258b2e3b74fa281dbd8df36133ef6
2020-04-28 20:33:59 -07:00
Carolyn Busch
4eeab3b81b Update cpython to 0.5
Summary:
D21270958 updated the cpython, python27-sys, and python3-sys crates to 0.5. Update
the Mercurial cargo dependencies to match.

Reviewed By: xavierd

Differential Revision: D21281875

fbshipit-source-id: ccad68749a25d11240351b5faeef27cb9c693456
2020-04-28 11:47:41 -07:00
Jun Wu
d479053954 metalog: support exporting to a git repo
Summary:
I wanted to figure out "who added this visible head", "what is the difference
between this metalog root and that root". Those are actually source control
operations (blame, diff). Add a git export feature so we can export metalog
to git to run those queries.

Choosing git here as we don't have native Rust utilities to create a more
efficient hg repo yet.

Ideally we can also make hg operate on a metalog directory as a "metalogrepo"
directly. However that seems to be quite difficult right now due to poor
abstractions.

Reviewed By: DurhamG

Differential Revision: D21213073

fbshipit-source-id: 4cc0331fbad6e1586907c0a66c18bcc25608ea49
2020-04-27 20:25:25 -07:00
Jun Wu
a0207c4542 metalog: expose root id API
Summary: This allows the Python world to obtain the root ID for logging purposes.

Reviewed By: DurhamG

Differential Revision: D21179513

fbshipit-source-id: 3f289c06d3d470ff492de39fa985203b3facbf00
2020-04-27 19:50:58 -07:00
Jun Wu
d8bdae0449 indexedlog: remove chown feature
Summary:
We removed the feature in D20704618 and it does not cause complaints.
Let's remove the code supporting the chown feature.

Reviewed By: DurhamG

Differential Revision: D21170307

fbshipit-source-id: c845016219e8c681930bb1780b94e6d31ca99730
2020-04-27 15:47:59 -07:00
Xavier Deguillard
86965b2f80 revisionstore: query store before fetching
Summary:
While the change looks fairly mechanical and simple, the why is a bit tricky.
If we follow the calls of `ContentStore::get`, we can see that it first goes
through every on-disk store, and then switches to the remote ones. Thanks to
that, when we reach the remote stores there is no reason to believe that the
local store attached to them contains the data we're fetching. Thus the
code used to always prefetch the data, before reading from the store what was
just written.

While this is true for regular stores (packstore, indexedlog, etc), it starts
to break down for the LFS store. The reason being that the LFS store is
internally represented as 2 halves: a pointer store, and a blob store.  It is
entirely possible that the LFS store contains a pointer, but not the actual
blob. In that case, the `get` executed on the LFS store will simply return
`Ok(None)` as the blob just isn't present, which will cause us to fall back to
the remote stores. Since we do have the pointer locally, we shouldn't try to
refetch it from the remote store, which is why a `get_missing` needs to be run
before fetching from the remote store.
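The pointer/blob split and the resulting need to ask `get_missing` first can be sketched like this. Everything here is a simplified model rather than the revisionstore code: `LocalSketch`, `RemoteSketch`, `remote_get`, and the `fetches` counter are hypothetical, and keys are plain strings.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical local store with the two halves described above.
#[derive(Default)]
pub struct LocalSketch {
    pub pointers: HashSet<String>,
    pub blobs: HashMap<String, Vec<u8>>,
}

impl LocalSketch {
    /// Keys the store knows nothing about. A key with a pointer but no
    /// blob is deliberately NOT reported as missing.
    pub fn get_missing(&self, keys: &[String]) -> Vec<String> {
        keys.iter()
            .filter(|k| !self.pointers.contains(*k) && !self.blobs.contains_key(*k))
            .cloned()
            .collect()
    }

    pub fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.blobs.get(key).cloned()
    }
}

/// Hypothetical remote side; `fetches` counts network round trips.
#[derive(Default)]
pub struct RemoteSketch {
    pub server: HashMap<String, Vec<u8>>,
    pub fetches: usize,
}

impl RemoteSketch {
    fn prefetch(&mut self, local: &mut LocalSketch, keys: &[String]) {
        for key in keys {
            self.fetches += 1;
            if let Some(data) = self.server.get(key) {
                local.blobs.insert(key.clone(), data.clone());
            }
        }
    }
}

/// Remote `get`: only fetch what the local store reports as missing,
/// then read back from the local store.
pub fn remote_get(local: &mut LocalSketch, remote: &mut RemoteSketch, key: &str) -> Option<Vec<u8>> {
    let missing = local.get_missing(&[key.to_string()]);
    if !missing.is_empty() {
        remote.prefetch(local, &missing);
    }
    local.get(key)
}
```

In this model, a key whose pointer is present but whose blob is absent yields `None` without touching the network, leaving it to whichever store handles LFS blobs to fetch the content.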

As I was writing this, I realized that all of this subtle behavior is basically
the same between all the stores, but unfortunately, doing a:
  impl<T: RemoteDataStore + ?Sized> HgIdDataStore for T
Conflicts with the one for `Deref<Target=HgIdDataStore>`. Macros could be used
to avoid code duplication, but for now let's not stray into them.

Reviewed By: DurhamG

Differential Revision: D21132667

fbshipit-source-id: 67a2544c36c2979dbac70dac5c1d055845509746
2020-04-27 12:53:11 -07:00
Qing Dong
b4edb7ff7f revisionstore: implement the get() functions on the various LocalDataStore interface
Summary: implement the get() functions on the various LocalDataStore interface implementations

Reviewed By: quark-zju

Differential Revision: D21220723

fbshipit-source-id: d69e805c40fb47db6970934e53a7cc8ac057b62b
2020-04-27 12:35:24 -07:00
Xavier Deguillard
39d49b694a revisionstore: remove memcache dependency on @mode/mac
Summary:
Memcache isn't available for Mac, but we can build the revisionstore with Buck
on macOS when building EdenFS. Let's only use Memcache for fbcode builds on
Linux for now.

Reviewed By: chadaustin

Differential Revision: D21235247

fbshipit-source-id: 5943ad84f6442e4dabbd2a44ae105457f5bb9d21
2020-04-24 17:29:36 -07:00
Zeyi (Rice) Fan
3374b99f28 util: ensure correct permission is set when creating directories
Summary:
When creating directories, we sometimes want to make sure other users within the same group have write access to them to enable data sharing. Previously we relied on setting umask for the entire process to make sure the newly created directories have the correct permission bits. This is kind of fragile and error-prone when running in a multi-threaded environment.

This diff introduces an internal function `create_dir_with_mode` to create a directory with a specified permission mode. It first creates a temporary directory within the parent of the directory being created, sets up the correct permission bits, then attempts to rename the temporary directory to the desired name. This ensures that we never leave a directory without the correct permissions in the place we need, and without changing umask for the process.
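The create-then-rename approach can be sketched with the standard library alone. This is not the code from the diff: `create_dir_with_mode_sketch` and its temporary naming scheme are hypothetical, and the sketch assumes a Unix platform.

```rust
use std::ffi::OsString;
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Creates `path` with exactly `mode`, without touching the process umask.
/// Hypothetical sketch; Unix only.
pub fn create_dir_with_mode_sketch(path: &Path, mode: u32) -> io::Result<()> {
    let invalid = || io::Error::new(io::ErrorKind::InvalidInput, "path needs a parent and a name");
    let parent = path.parent().ok_or_else(invalid)?;
    let name = path.file_name().ok_or_else(invalid)?;

    // Temporary sibling in the same parent, so the rename stays on one filesystem.
    let mut tmp_name = OsString::from(".tmp-");
    tmp_name.push(name);
    tmp_name.push(format!("-{}", std::process::id()));
    let tmp = parent.join(tmp_name);

    fs::create_dir(&tmp)?;
    // `set_permissions` is a chmod, so the umask does not mask these bits.
    let result = fs::set_permissions(&tmp, fs::Permissions::from_mode(mode))
        .and_then(|_| fs::rename(&tmp, path));
    if result.is_err() {
        // Best-effort cleanup; the original error is what gets reported.
        let _ = fs::remove_dir(&tmp);
    }
    result
}
```

Because the directory only appears under its final name after its permissions are set, other processes never observe it with the wrong mode.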

Reviewed By: xavierd

Differential Revision: D21188903

fbshipit-source-id: 381bff7d3aaca097b9d50150e86cbbf70a90a0a5
2020-04-24 17:17:05 -07:00
Durham Goode
092d350800 filesystem: add treestate walking logic
Summary:
The second phase of pending changes is to iterate over the treestate
and figure out what files were not seen in the filesystem walk. This diff
implements that.
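At its core, this phase amounts to a set difference between what the treestate tracks and what the walk saw. A minimal sketch, assuming paths are plain strings and using the hypothetical helper `unseen_tracked_files`:

```rust
use std::collections::BTreeSet;

/// Returns tracked paths that the filesystem walk did not report,
/// in sorted order. Hypothetical helper for illustration only.
pub fn unseen_tracked_files(tracked: &BTreeSet<String>, seen: &BTreeSet<String>) -> Vec<String> {
    tracked.difference(seen).cloned().collect()
}
```

Paths returned by such a helper would be the ones that need further inspection, for example because the file no longer exists on disk.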

Reviewed By: xavierd

Differential Revision: D20546899

fbshipit-source-id: 3523fbc7e31ef0ed09c4937c72264b64e2a3db5b
2020-04-24 13:58:53 -07:00
Durham Goode
73a45b695b filesystem: add filesystem walking to PendingChanges
Summary:
The first phase of pending changes is inspecting the filesystem for
changes. This diff adds that logic.

Reviewed By: xavierd

Differential Revision: D20546909

fbshipit-source-id: 1c2c0fa7f700dbff4acfce4d5271b4472a13571f
2020-04-24 13:58:53 -07:00
Xavier Deguillard
19bfd35298 revisionstore: multiplex stores should return a path on flush
Summary:
On repack, when the Rust stores are in use, the repack code relies on
ContentStore::commit_pending to return the path of a newly created packfile, so
it won't delete it when going over the repacked ones. When LFS is enabled, both
the shared and the local stores are behind the LfsMultiplexer store that
unfortunately would always return `Ok(None)`. In this situation, the repack
code would delete all the repacked packfiles, which usually is the expected
behavior, unless only one packfile is being repacked, in which case the repack
code effectively re-creates the same packfile, which is then subsequently
deleted.

The solution is for the multiplex stores to properly return a path if one was
returned from the underlying stores.

Reviewed By: DurhamG

Differential Revision: D21211981

fbshipit-source-id: 74e4b9e5d2f5d9409ce732935552a02bdde85b93
2020-04-23 15:14:28 -07:00
Arun Kulshreshtha
6d3cacf9fd edenapi: add utility programs
Summary:
Add two utility programs for ad-hoc debugging of EdenAPI. EdenAPI requests and responses are encoded as CBOR, which is not easy to work with manually on the command line. In order to allow debugging the HTTP API using tools like `curl`, we need tools that can generate raw request payloads and interpret CBOR responses.

The utility programs included in this diff are:

- `make_req` - Can construct EdenAPI request payloads from a human-editable JSON representation.
- `data_util` - Can list, validate, and extract the contents of an EdenAPI data response.

These tools can be used by themselves or as part of a pipeline. See test plan for examples.

Reviewed By: xavierd

Differential Revision: D21136575

fbshipit-source-id: d1ac8d92964614005078a6ac76dd0835c29a80a5
2020-04-23 11:43:51 -07:00
Mark Thomas
c05efd8a5c mutationstore: move MutationEntry type to types crate
Summary: Move the MutationEntry type to the Mercurial types crate.  This will allow us to use it from Mononoke.

Reviewed By: quark-zju

Differential Revision: D20871338

fbshipit-source-id: 8de3bb8a2673673bc4c8a6cc7578a0a76358c14a
2020-04-23 08:58:10 -07:00
Durham Goode
dbff6c6b9a filesystem: add initial PendingChanges stubs
Summary:
The part of status that lists what files have changed is called
PendingChanges. This diff introduces the initial stub for PendingChanges. The
pending changes algorithm involves three parts:

1. Looking at files on the filesystem for changes.
2. Looking at files in the dirstate map for changes.
3. Looking at the content for any files that we were unsure of during steps 1
and 2.

This diff puts the basic state machine in place, and accepts the basic
information about the working copy (the root and what type of filesystem it is).
In the future we might have it detect what type of filesystem it is, but for now
this makes it easy.
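The three-part algorithm maps naturally onto a small state machine. The sketch below is only an illustration of that shape: `PendingChangesSketch`, `Stage`, and `FsKind` (including its variants) are hypothetical names, and no actual file inspection is performed.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical filesystem kinds passed in by the caller.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FsKind {
    Normal,
    Virtual,
}

/// The phases listed above, plus a terminal state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Filesystem,
    Dirstate,
    Content,
    Done,
}

pub struct PendingChangesSketch {
    root: PathBuf,
    fs_kind: FsKind,
    stage: Stage,
}

impl PendingChangesSketch {
    /// Starts in the filesystem phase for the given working copy.
    pub fn new(root: PathBuf, fs_kind: FsKind) -> Self {
        PendingChangesSketch { root, fs_kind, stage: Stage::Filesystem }
    }

    pub fn root(&self) -> &Path {
        &self.root
    }

    pub fn fs_kind(&self) -> FsKind {
        self.fs_kind
    }

    pub fn stage(&self) -> Stage {
        self.stage
    }

    /// Moves to the next phase and returns it; `Done` is sticky.
    pub fn advance(&mut self) -> Stage {
        self.stage = match self.stage {
            Stage::Filesystem => Stage::Dirstate,
            Stage::Dirstate => Stage::Content,
            Stage::Content | Stage::Done => Stage::Done,
        };
        self.stage
    }
}
```

A real implementation would presumably yield changed files from each phase before advancing; the sketch only tracks which phase is active.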

Reviewed By: xavierd

Differential Revision: D20546898

fbshipit-source-id: a3030b7c846b3cb2fcba805b7fe4744df7c5764e
2020-04-22 19:55:50 -07:00
Durham Goode
701273d08f treestate: trim separators off get_filtered_keys inputs and outputs
Summary:
treestate.get_filtered_keys passes directory paths to the filter
function and returns directory matches with a trailing '/' on the end. This
makes it difficult to act as a path normalization function when the caller
doesn't know if the path is a file or directory.

It seems like we can just strip the trailing '/' before exposing the strings to
the caller (both as filter inputs and as get_filtered_keys outputs).
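The stripping on both sides can be sketched as follows. The function names `trim_trailing_separator` and `filtered_keys` and the slice-based signature are assumptions for illustration; the real `get_filtered_keys` operates on the treestate rather than on a slice.

```rust
/// Removes a single trailing '/' if present.
pub fn trim_trailing_separator(path: &str) -> &str {
    path.strip_suffix('/').unwrap_or(path)
}

/// Applies `keep` to each key with the trailing separator removed, and
/// returns the matching keys in the same trimmed form.
/// Hypothetical stand-in for the treestate lookup.
pub fn filtered_keys<'a, F>(keys: &[&'a str], keep: F) -> Vec<&'a str>
where
    F: Fn(&str) -> bool,
{
    keys.iter()
        .copied()
        .map(trim_trailing_separator)
        .filter(|k| keep(*k))
        .collect()
}
```

With this shape, a caller doing case normalization can compare `"dir"` against the stored `"Dir/"` without knowing whether the path names a file or a directory.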

This is useful in the following diff that adds a case normalization crate.

Reviewed By: xavierd

Differential Revision: D20880881

fbshipit-source-id: 6e9f419178b4e278844244bd6aff2fc10e09d2cd
2020-04-22 19:55:50 -07:00
Durham Goode
e97d8d8895 vfs: move vfs logic into its own crate
Summary:
This logic will be used in a variety of places (update workers, status,
etc). Let's move it somewhere common.

Reviewed By: xavierd

Differential Revision: D20771623

fbshipit-source-id: b4de7c1d20055a10bbc1143d44c55ea1045ec62a
2020-04-22 19:55:49 -07:00
Durham Goode
a6e2b90c2e pathauditor: move into workingcopy crate
Summary:
PathAuditor will be needed for native status soon. Let's move it into
the workingcopy crate.

Reviewed By: xavierd

Differential Revision: D20546906

fbshipit-source-id: ef69f88ee828a72e82b5e944cc7913f391bd8a2f
2020-04-22 19:55:49 -07:00
Durham Goode
faced01356 tracing: add more trace values
Summary: This will help us debug slow commands

Reviewed By: xavierd

Differential Revision: D21075895

fbshipit-source-id: 3e7667bb0e4426d743841d8fda00fa4a315f0120
2020-04-22 15:35:17 -07:00
Xavier Deguillard
51438d13e7 revisionstore: write data to store when reading from memcache
Summary:
The Memcache store is deliberately added to the ContentStore read store, first
as a regular store, and then as a remote one. The regular store is added to
enable the slower remote store to write to it so that blobs are uploaded to
Memcache as we read them from the network. The subtle part of this is that the
HgIdDataStore methods should not do anything, or the data fetched won't be
written to any on-disk store, forcing a refetch next time the blob is needed.

Reviewed By: DurhamG

Differential Revision: D21132669

fbshipit-source-id: 96e963c7bb4209add5a51a5fc48bc38f6bcd2cd9
2020-04-21 18:35:38 -07:00