Summary: The compiler is warning about it.
Reviewed By: singhsrb
Differential Revision: D21550266
fbshipit-source-id: 4e66b0dda0e443ed63aeccd888d38a8fcb5e4066
Summary:
Part of the mutation graph (excluding split and fold) can fit in the DAG
abstraction. Add a method to do that. This allows cross-dag calculations
like:
    changelogdag = ...  # suppose available by segmented changelog
    # mutdag and changelogdag are independent (might have different nodes),
    # with full DAG operations on either of them.
    mutdag = mutation.getdag(...)
    mutdag.heads(mutdag.descendants([node])) & changelogdag.descendants([node2])  # now possible
Compared to the current situation, this has some advantages:
- No need to couple the "visibility", "filtered node" logic to the mutation
layer. The unknown nodes can be filtered out naturally by a set "&"
operation.
- DAG operations like heads and roots can be performed on mutdag, which was
previously impossible. We also get operations like visualization for free.
There are some limitations, though:
- The DAG cannot represent non 1:1 modifications (fold, split) losslessly.
Those relationships are simply ignored for now.
- The MemNameDag is not lazy. Reading a long chain of amends might be slow.
For most normal use-cases it is probably okay. If it becomes an issue we
can look for other solutions, for example, storing part of mutationstore
directly in a DAG format on disk, or adding fast paths to bypass long
predecessor chain calculation.
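As a rough illustration of the cross-DAG calculation above, here is a minimal Python sketch. It models each DAG as a plain dict from node to parent list. The `descendants` and `heads` helpers are hypothetical and written only for this example; they are not the actual `mutdag` or `changelogdag` API.

```python
def descendants(parents, roots):
    """Return the roots plus every node reachable by following child edges."""
    # Invert the parent map into a child map.
    children = {}
    for node, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(node)
    seen = set()
    # Roots unknown to this DAG are ignored.
    stack = [r for r in roots if r in parents]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return seen


def heads(parents, nodes):
    """Return the nodes in `nodes` that have no child inside `nodes`."""
    nodes = set(nodes)
    has_child = {p for n in nodes for p in parents.get(n, []) if p in nodes}
    return nodes - has_child


# Mutation DAG: A was amended to B, then B to C (a "parent" is a predecessor).
mutdag = {"A": [], "B": ["A"], "C": ["B"]}
# Changelog DAG: only X, Y and C are known commits; A and B are not.
changelogdag = {"X": [], "Y": ["X"], "C": ["Y"]}

latest = heads(mutdag, descendants(mutdag, ["A"]))
# Nodes unknown to the changelog drop out through the set intersection.
visible = latest & descendants(changelogdag, ["X"])
```

The two dicts never need to agree on their node sets, which is the point of keeping the "visibility" logic out of the mutation layer.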
Reviewed By: DurhamG
Differential Revision: D21486521
fbshipit-source-id: 03624c8e9803eb1852b3034b8f245555ec582e85
Summary: Add the ability to parse EdenAPI history responses to `data_util`.
Reviewed By: quark-zju
Differential Revision: D21489228
fbshipit-source-id: 42dda64273673431a6f3e4d7bd430689c76c387f
Summary: Change `make_req` to take a JSON array as input when constructing `DataRequest`s instead of a JSON object. This is more correct because DataRequests can include multiple `Key`s with the same path; this cannot be represented as an object since an object is effectively a hash map wherein we would have duplicate keys.
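To see why an object cannot carry repeated paths, consider this small Python sketch. The `[path, hash]` pair layout is an assumption made for the example and is not necessarily the exact input format `make_req` accepts.

```python
import json

# As a JSON object, the second entry for "foo/bar" silently replaces the first.
as_object = json.loads('{"foo/bar": "aaaa", "foo/bar": "bbbb"}')

# As a JSON array of [path, hash] pairs (hypothetical layout), both survive.
as_array = json.loads('[["foo/bar", "aaaa"], ["foo/bar", "bbbb"]]')
```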
Reviewed By: quark-zju
Differential Revision: D21412989
fbshipit-source-id: 07a092a15372d86f3198bea2aa07b973b1a8449d
Summary:
Pass `configparser::config::ConfigSet` to `repack` in
`revisionstore/src/repack.rs` so that we can use various config values in `filter_incrementalpacks`.
* `repack.maxdatapacksize`, `repack.maxhistpacksize`
  * The overall max pack size
* `repack.sizelimit`
  * The size limit for any individual pack
* `repack.maxpacks`
  * The maximum number of packs we want to have after repack (overrides sizelimit)
Reviewed By: xavierd
Differential Revision: D21484836
fbshipit-source-id: 0407d50dfd69f23694fb736e729819b7285f480f
Summary:
If http_proxy.no is set, we should respect it and avoid sending traffic through
the proxy for the hosts it lists.
Reviewed By: wez
Differential Revision: D21383138
fbshipit-source-id: 4c8286aaaf51cbe19402bcf8e4ed03e0d167228b
Summary:
When Qing implemented all the get methods, the translate_lfs_missing function
didn't exist, and I forgot to add calls to it in the right places when landing
the diff that added it. Fix this.
Reviewed By: sfilipco
Differential Revision: D21418043
fbshipit-source-id: baf67b0fe60ed20aeb2c1acd50a209d04dc91c5e
Summary: This would be handy to visualize a MemNameDag.
Reviewed By: sfilipco
Differential Revision: D21486522
fbshipit-source-id: c8d7147dc53a1a7c1b8b09ce055493c69cceba2f
Summary:
Use MemNameDag::from_ascii to simplify the tests. This removes the need for:
- using tempdir
- converting between Id and VertexName manually via an IdMap
- depending on drawdag directly
Reviewed By: sfilipco
Differential Revision: D21486519
fbshipit-source-id: f04061d8892f043de40e7e321273acc51e15308a
Summary:
It seems handy to construct a Dag just from ASCII. Therefore move it to a
public interface.
Reviewed By: sfilipco
Differential Revision: D21486525
fbshipit-source-id: de7f4b8dfcbcc486798928d4334c655431373276
Summary:
They are part of the read-only algorithms that are not specific to a certain
type of NameDag.
Reviewed By: sfilipco
Differential Revision: D21479017
fbshipit-source-id: 3fa58071ac43246d3cd45d84384ee93c7385f414
Summary:
Adds an in-memory NameDag so we can construct the DAG and use its algorithms by
just providing parents function and heads.
Reviewed By: sfilipco
Differential Revision: D21479021
fbshipit-source-id: e12d53a97afec77b2307d5efbb280bd506dee0ba
Summary: Adds an in-memory IdMap to be used in an in-memory NameDag.
Reviewed By: sfilipco
Differential Revision: D21479018
fbshipit-source-id: bc702762b059e8659c6ab322f3c39f032e95d5b6
Summary:
This allows them to switch to a different IdMap implementation relatively
easily.
Reviewed By: sfilipco
Differential Revision: D21479023
fbshipit-source-id: 8ecb99cafe2093ec7d14b848ffa08581c5300414
Summary: This will allow different IdMap implementations.
Reviewed By: sfilipco
Differential Revision: D21479016
fbshipit-source-id: 852501896fddcb82624338acd9dceee41150e302
Summary:
`NameDag::add_heads` API changes the internal `dag` state without updating
`snapshot_map`. That will cause queries relying on `snapshot_map` to fail.
Update it so that `snapshot_map` gets updated by `add_heads`.
Reviewed By: sfilipco
Differential Revision: D21479019
fbshipit-source-id: 70528aa4a488cef3dc71bf21dd89e45cfe763794
Summary:
This makes it easier to add an "in-memory-only" NameDag with all the algorithms
implemented.
Reviewed By: sfilipco
Differential Revision: D21479020
fbshipit-source-id: c1a73e95f3291c273c800650f70db2a7eb0966d7
Summary: If no LFS blobs need uploading, then don't try to connect to the LFS server in the first place.
Reviewed By: DurhamG
Differential Revision: D21478243
fbshipit-source-id: 81fa960d899b14f47aadf2fc90485747889041e1
Summary:
Remove HgIdDataStore::get_delta and all implementations. Remove HgIdDataStore::get_delta_chain from the trait, remove all unnecessary implementations, and remove all implementations from the public Rust API. Leave the Python API and introduce "delta-wrapping".
MutableDataPack::get_delta_chain must remain in some form, as it is necessary to implement get using a sequence of Deltas. It has been moved to a private inherent impl.
DataPack::get_delta_chain must remain in some form for the same reasons, and in fact both implementations can probably be merged, but it is also used in repack.rs for the free function repack_datapack. There are a few ways to address this without making DataPack::get_delta_chain part of the public API. I've currently chosen to make the method pub(crate), i.e. visible only within the revisionstore crate. Alternatively, we could move the repack_datapack function to a method on DataPack, or use a trait in a private module, or some other technique to restrict visibility to only where necessary.
UnionDataStore::get has been modified to call get on its sub-stores and return the first result which matches the given key.
MultiplexDeltaStore has been modified to implement get similarly to UnionDataStore.
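The "first sub-store that has the key wins" lookup can be sketched in a few lines of Python. `DictStore` and `UnionStore` are hypothetical stand-ins written for this example; the real stores are Rust types with a richer interface.

```python
class DictStore:
    """Toy store backed by a dict; returns None when the key is absent."""

    def __init__(self, data):
        self.data = data

    def get(self, key):
        return self.data.get(key)


class UnionStore:
    """Query sub-stores in order and return the first hit."""

    def __init__(self, stores):
        self.stores = stores

    def get(self, key):
        for store in self.stores:
            value = store.get(key)
            if value is not None:
                return value
        # No sub-store had the key.
        return None


union = UnionStore([DictStore({"a": b"local"}), DictStore({"a": b"shared", "b": b"x"})])
```

Store order matters here: the earlier store shadows later ones for the same key.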
Reviewed By: xavierd
Differential Revision: D21356420
fbshipit-source-id: d04e18a0781374a138395d1c21c3687897223d15
Summary:
Update `contrib/check-code.py` to Python 3.
Mostly it was already compatible, however stricter regular expression parsing
revealed a case where one of our tests wasn't working, and as a result lots of
instances of `open(file).read()` existed that this test should have caught.
I have fixed up most of the instances in the code, although there are many
in the test suite that I have ignored for now.
Reviewed By: quark-zju
Differential Revision: D21427212
fbshipit-source-id: 7461a7c391e0ade947f779a2b476ca937fd24a8d
Summary:
A number of repo names are used quite frequently. Let's use an enum to
prevent typos and make things cleaner.
Reviewed By: quark-zju
Differential Revision: D21365036
fbshipit-source-id: 1d3d681443df181e9076f5ee87029ae61124a486
Summary: This bug got in while iterating on the original diff. It should only be returning empty when the blob does not exist locally.
Reviewed By: xavierd
Differential Revision: D21417659
fbshipit-source-id: 676e22313ab4a024af5341d8c99797fc062bd293
Summary:
Instead of trying to maintain two hgrc.dynamic's for shared repositories,
let's just always use the one in the shared repo. In the long term we may be
able to get rid of the working-copy-specific hgrc entirely.
This does remove the ability to dynamically configure individual working copies.
That could be useful in cases where we have both eden and non-eden pointed at
the same repository, but I don't think we rely on this at the moment.
Reviewed By: quark-zju
Differential Revision: D21333564
fbshipit-source-id: c1fb86af183ec6dc5d973cf45d71419bda5514fb
Summary:
Adds .hg/hgrc.dynamic to the default load path, before .hg/hgrc though,
so it can be overridden.
Reviewed By: quark-zju
Differential Revision: D21310921
fbshipit-source-id: 288a2a2ba671943a9f8532489c29e819f9d891e1
Summary:
Our internal git dependency got upgraded, so we need to upgrade our
Cargo.toml version. Unfortunately this doesn't seem to have any test coverage?
Reviewed By: singhsrb
Differential Revision: D21410241
fbshipit-source-id: 64fe7f39a9c93aa5d97ce095ee1641c1cc6ed365
Summary:
Talked with xavierd last week and we can use LocalStore's `get_missing` to determine if a blob is present locally. In this way we can prevent the backingstore crate from accidentally asking EdenAPI for a blob, which gives better control at the EdenFS level.
With this change, we can use this function at the time a blob import request is created, with confidence that it is a short, cheap call.
This diff should not change any behavior or performance.
Reviewed By: xavierd
Differential Revision: D21391959
fbshipit-source-id: fd31687da1e048262cb4eae2974cab6d8915a76d
Summary: When we create directories at certain places, we want these directories to be shared between different users on the same machine. This diff uses the previously added `create_shared_dir` function to create these directories.
Reviewed By: xavierd
Differential Revision: D21322776
fbshipit-source-id: 5af01d0fc79c8d2bc5f946c105d74935ff92daf2
Summary:
We'll be adding a bunch of Facebook specific configuration and values
here. Let's move it to someplace not open source.
Reviewed By: quark-zju
Differential Revision: D21241038
fbshipit-source-id: 2ac9cdce40b1b46f15f171d9d1f6b6692dcd29bf
Summary:
Implements an ensure_location_supersets function whose goal is to
verify that a given config location specifies the exact same configs as a given
set of other locations. Any inconsistencies are removed from the config and
reported to the caller.
This will be used to ensure our dynamic configs match our existing rc file
configs exactly, before we delete the file configs.
Reviewed By: quark-zju
Differential Revision: D21240837
fbshipit-source-id: e2c8ec054a3696d2cf02e65c212ad886c5117253
Summary: `cargo autocargo` should normally produce no changes on `master`. The features of the `log` crate were updated in D21303891 without re-running autocargo. This fixes it.
Reviewed By: dtolnay
Differential Revision: D21349799
fbshipit-source-id: ce487bc5989e179673297350249593103b4d34dd
Summary: Fix permission issues we are seeing with the latest Mercurial release.
Reviewed By: xavierd
Differential Revision: D21294499
fbshipit-source-id: bcfb13dd005258b2e3b74fa281dbd8df36133ef6
Summary:
I wanted to figure out "who added this visible head", "what is the difference
between this metalog root and that root". Those are actually source control
operations (blame, diff). Add a git export feature so we can export metalog
to git to run those queries.
Choosing git here as we don't have native Rust utilities to create a more
efficient hg repo yet.
Ideally we can also make hg operate on a metalog directory as a "metalogrepo"
directly. However that seems to be quite difficult right now due to poor
abstractions.
Reviewed By: DurhamG
Differential Revision: D21213073
fbshipit-source-id: 4cc0331fbad6e1586907c0a66c18bcc25608ea49
Summary: This allows the Python world to obtain the root ID for logging purposes.
Reviewed By: DurhamG
Differential Revision: D21179513
fbshipit-source-id: 3f289c06d3d470ff492de39fa985203b3facbf00
Summary:
We removed the feature in D20704618 and it does not cause complaints.
Let's remove the code supporting the chown feature.
Reviewed By: DurhamG
Differential Revision: D21170307
fbshipit-source-id: c845016219e8c681930bb1780b94e6d31ca99730
Summary:
While the change looks fairly mechanical and simple, the why is a bit tricky.
If we follow the calls of `ContentStore::get`, we can see that it first goes
through every on-disk store, and then switches to the remote ones. Thanks to
that, when we reach the remote stores there is no reason to believe that the
local store attached to them contains the data we're fetching. Thus the
code used to always prefetch the data, before reading from the store what was
just written.
While this is true for regular stores (packstore, indexedlog, etc), it starts
to break down for the LFS store. The reason being that the LFS store is
internally represented as 2 halves: a pointer store, and a blob store. It is
entirely possible that the LFS store contains a pointer, but not the actual
blob. In that case, the `get` executed on the LFS store will simply return
`Ok(None)` as the blob just isn't present, which will cause us to fallback to
the remote stores. Since we do have the pointer locally, we shouldn't try to
refetch it from the remote store, which is why a `get_missing` needs to be run
before fetching from the remote store.
As I was writing this, I realized that all of this subtle behavior is basically
the same between all the stores, but unfortunately, doing a:
impl<T: RemoteDataStore + ?Sized> HgIdDataStore for T
Conflicts with the one for `Deref<Target=HgIdDataStore>`. Macros could be used
to avoid code duplication, but for now let's not stray into them.
Reviewed By: DurhamG
Differential Revision: D21132667
fbshipit-source-id: 67a2544c36c2979dbac70dac5c1d055845509746
Summary: implement the get() functions on the various LocalDataStore interface implementations
Reviewed By: quark-zju
Differential Revision: D21220723
fbshipit-source-id: d69e805c40fb47db6970934e53a7cc8ac057b62b
Summary:
Memcache isn't available for Mac, but we can build the revisionstore with Buck
on macOS when building EdenFS. Let's only use Memcache for fbcode builds on
Linux for now.
Reviewed By: chadaustin
Differential Revision: D21235247
fbshipit-source-id: 5943ad84f6442e4dabbd2a44ae105457f5bb9d21
Summary:
When creating directories, we sometimes want to make sure other users within the same group have write access to them to enable data sharing. Previously we relied on setting the umask for the entire process to make sure the newly created directories have the correct permission bits. This is fragile and error-prone when running in a multi-threaded environment.
This diff introduces an internal function `create_dir_with_mode` to create a directory with the specified permission mode. It first creates a temporary directory within the parent of the directory being created, sets the correct permission bits on it, then attempts to rename the temporary directory to the desired name. This ensures that we never leave a directory without the correct permissions in the place we need, and that we never change the umask for the process.
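The create-then-rename approach can be sketched in Python as follows. This is an illustration of the technique only, assuming a POSIX filesystem; the real `create_dir_with_mode` is a Rust function and its error handling (for example when the target already exists) may differ.

```python
import os
import shutil
import tempfile


def create_dir_with_mode(path, mode):
    """Create `path` with permission bits `mode` without touching the umask."""
    parent = os.path.dirname(os.path.abspath(path))
    # The temporary directory lives next to the target so that the rename
    # stays on one filesystem.
    tmp = tempfile.mkdtemp(dir=parent)
    try:
        # chmod sets the bits explicitly; unlike mkdir, it ignores the umask.
        os.chmod(tmp, mode)
        # The directory only becomes visible under its final name once it
        # already has the right permissions.
        os.rename(tmp, path)
    except OSError:
        # Do not leave the temporary directory behind on failure.
        shutil.rmtree(tmp, ignore_errors=True)
        raise
```

Because the umask is process-wide state, changing it from one thread affects every other thread creating files at the same time; the sketch avoids that by never calling `os.umask`.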
Reviewed By: xavierd
Differential Revision: D21188903
fbshipit-source-id: 381bff7d3aaca097b9d50150e86cbbf70a90a0a5
Summary:
The second phase of pending changes is to iterate over the treestate
and figure out what files were not seen in the filesystem walk. This diff
implements that.
Reviewed By: xavierd
Differential Revision: D20546899
fbshipit-source-id: 3523fbc7e31ef0ed09c4937c72264b64e2a3db5b
Summary:
The first phase of pending changes is inspecting the filesystem for
changes. This diff adds that logic.
Reviewed By: xavierd
Differential Revision: D20546909
fbshipit-source-id: 1c2c0fa7f700dbff4acfce4d5271b4472a13571f
Summary:
On repack, when the Rust stores are in use, the repack code relies on
ContentStore::commit_pending to return the path of a newly created packfile, so
it won't delete it when going over the repacked ones. When LFS is enabled, both
the shared and the local stores are behind the LfsMultiplexer store that
unfortunately would always return `Ok(None)`. In this situation, the repack
code would delete all the repacked packfiles, which usually is the expected
behavior, unless only one packfile is being repacked, in which case the repack
code effectively re-creates the same packfile, which is then subsequently
deleted.
The solution is for the multiplex stores to properly return a path if one was
returned from the underlying stores.
Reviewed By: DurhamG
Differential Revision: D21211981
fbshipit-source-id: 74e4b9e5d2f5d9409ce732935552a02bdde85b93
Summary:
Add two utility programs for ad-hoc debugging of EdenAPI. EdenAPI requests and responses are encoded as CBOR, which is not easy to work with manually on the command line. In order to allow debugging the HTTP API using tools like `curl`, we need tools that can generate raw request payloads and interpret CBOR responses.
The utility programs included in this diff are:
- `make_req` - Can construct EdenAPI request payloads from a human-editable JSON representation.
- `data_util` - Can list, validate, and extract the contents of an EdenAPI data response.
These tools can be used by themselves or as part of a pipeline. See test plan for examples.
Reviewed By: xavierd
Differential Revision: D21136575
fbshipit-source-id: d1ac8d92964614005078a6ac76dd0835c29a80a5
Summary: Move the MutationEntry type to the Mercurial types crate. This will allow us to use it from Mononoke.
Reviewed By: quark-zju
Differential Revision: D20871338
fbshipit-source-id: 8de3bb8a2673673bc4c8a6cc7578a0a76358c14a
Summary:
The part of status that lists what files have changed is called
PendingChanges. This diff introduces the initial stub for PendingChanges. The
pending changes algorithm involves three parts:
1. Looking at files on the filesystem for changes.
2. Looking at files in the dirstate map for changes.
3. Looking at the content for any files that we were unsure of during steps 1
and 2.
This diff puts the basic state machine in place, and accepts the basic
information about the working copy (the root and what type of filesystem it is).
In the future we might have it detect what type of filesystem it is, but for now
this makes it easy.
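The three parts listed above could look roughly like the following Python sketch. The metadata layout `(size, mtime)`, the change kinds, and the rule "same size but different mtime means unsure" are assumptions made for the example; they are not taken from the actual PendingChanges implementation.

```python
def pending_changes(on_disk, treestate, same_content):
    """Return sorted (path, kind) pairs.

    on_disk:      path -> (size, mtime) seen by the filesystem walk
    treestate:    path -> (size, mtime) recorded for tracked files
    same_content: callable(path) -> bool, the expensive content check
    """
    changes = []
    unsure = []
    # Part 1: look at files on the filesystem.
    for path, meta in on_disk.items():
        if path not in treestate:
            changes.append((path, "unknown"))
        elif treestate[path][0] != meta[0]:
            changes.append((path, "modified"))  # size differs
        elif treestate[path][1] != meta[1]:
            unsure.append(path)  # same size, different mtime
    # Part 2: look at tracked files that the walk did not see.
    for path in treestate:
        if path not in on_disk:
            changes.append((path, "deleted"))
    # Part 3: compare content for the files we were unsure about.
    for path in unsure:
        if not same_content(path):
            changes.append((path, "modified"))
    return sorted(changes)
```

Deferring the content comparison to the last part keeps the expensive reads limited to the files that cheap metadata checks could not decide.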
Reviewed By: xavierd
Differential Revision: D20546898
fbshipit-source-id: a3030b7c846b3cb2fcba805b7fe4744df7c5764e
Summary:
treestate.get_filtered_keys passes directory paths to the filter
function and returns directory matches with a trailing '/' on the end. This
makes it difficult to act as a path normalization function when the caller
doesn't know if the path is a file or directory.
It seems like we can just strip the trailing '/' before exposing the strings to
the caller (both as filter inputs and as get_filtered_keys outputs).
This is useful in the following diff that adds a case normalization crate.
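A minimal Python sketch of the stripping behavior described above. The function names and signatures are hypothetical and only mirror the idea; the real `treestate.get_filtered_keys` has a different interface.

```python
def normalize_key(key):
    """Strip one trailing '/' so directories and files look alike."""
    return key[:-1] if key.endswith("/") else key


def filtered_keys(keys, keep):
    """Return normalized keys for which `keep(normalized_key)` is true."""
    result = []
    for key in keys:
        name = normalize_key(key)
        # The filter sees the same slash-free form that the caller gets back.
        if keep(name):
            result.append(name)
    return result
```

With this shape, a case-insensitive lookup such as `filtered_keys(keys, lambda p: p.lower() == "foo")` works whether `Foo` is a file or a directory.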
Reviewed By: xavierd
Differential Revision: D20880881
fbshipit-source-id: 6e9f419178b4e278844244bd6aff2fc10e09d2cd
Summary:
This logic will be used in a variety of places (update workers, status,
etc). Let's move it somewhere common.
Reviewed By: xavierd
Differential Revision: D20771623
fbshipit-source-id: b4de7c1d20055a10bbc1143d44c55ea1045ec62a
Summary:
PathAuditor will be needed for native status soon. Let's move it into
the workingcopy crate.
Reviewed By: xavierd
Differential Revision: D20546906
fbshipit-source-id: ef69f88ee828a72e82b5e944cc7913f391bd8a2f
Summary: This will help us debug slow commands
Reviewed By: xavierd
Differential Revision: D21075895
fbshipit-source-id: 3e7667bb0e4426d743841d8fda00fa4a315f0120
Summary:
The Memcache store is deliberately added twice to the ContentStore read store, first
as a regular store, and then as a remote one. The regular store is added to
enable the slower remote store to write to it so that blobs are uploaded to
Memcache as we read them from the network. The subtle part of this is that the
HgIdDataStore methods should not do anything, or the data fetched won't be
written to any on-disk store, forcing a refetch next time the blob is needed.
Reviewed By: DurhamG
Differential Revision: D21132669
fbshipit-source-id: 96e963c7bb4209add5a51a5fc48bc38f6bcd2cd9
Summary:
When comparing an empty file with a file with content, our xdiff wrongly included
a warning about the missing newline, which also made the line counter in the hunk
header off by one.
Empty files are quite rare in our repos, that's why I discovered this bug only
now (it broke phabricator parsing of this single commit).
Reviewed By: markedson1024
Differential Revision: D21141341
fbshipit-source-id: 9d3e0d8a61ac4ee2cf27978b99b3a092259ee186
Summary:
Ideally, either the ContentStore, or the upper layer should verify that we
haven't missed uploading a blob, which could lead to weird behavior down the
line. For now, all the stores will return the keys of the blobs that weren't
uploaded, which allows us to return these keys to Python.
Reviewed By: DurhamG
Differential Revision: D21103998
fbshipit-source-id: 5bab0bbec32244291c65a07aa2a13aec344e715e
Summary:
We'll be adding more data to the filesystem layer, so let's move this
out of lib.rs.
Also made a slight tweak to expose File metadata in the walk results, which will be used by the future pending changes logic to avoid re-stating the file.
Reviewed By: xavierd
Differential Revision: D20546903
fbshipit-source-id: 70456055b0da601990e6d6ff535678d2df6c50ba
Summary:
This allows the streampager to be configured via hgrc files.
Defaults are picked so the behavior is closer to the current default pager
(`less -FRX`).
Reviewed By: DurhamG
Differential Revision: D20902034
fbshipit-source-id: 994ab963ceace02eeb1d18cfa5768e411ca3610b
Summary: This makes it work with chg, since `/dev/tty` is not available for chg.
Reviewed By: DurhamG
Differential Revision: D20936967
fbshipit-source-id: f3ded1aa5552f321ff7043a039f4e35a88160a51
Summary:
We want Mercurial to become more responsible for its own
configuration, instead of relying on chef and other means. To do so, let's
introduce a new `hg debugdynamicconfig` that can generate dynamic configs for
a given repository based on various states, like what tier it's in or what shard
that machine is in. By default it generates to '.hg/hgrc.dynamic' for the given
repository.
Currently it just sets the hostgroup config.
Future diffs will make Mercurial consume this config, and possibly have Mercurial
call this command asynchronously when it notices the file is out-of-date.
Reviewed By: quark-zju
Differential Revision: D20828132
fbshipit-source-id: 6f5bf749f5b04e0a5989d6dc19ee788c2e47f88f
Summary:
A future diff will want to generate configs programmatically and write
them to a file. Let's add write support to ConfigSet.
Reviewed By: quark-zju
Differential Revision: D20828133
fbshipit-source-id: 702f6f9bdfdf99ef25c6e1c0ab33373a4b6508fe
Summary:
The revisionstore is a large crate with many dependencies, split out the types part which is most likely to be shared between different pieces of eden/mononoke infrastructure.
With this split it was easy to get eden/mononoke/mercurial/bundles
Reviewed By: farnz
Differential Revision: D20869220
fbshipit-source-id: e9ee4144e7f6250af44802e43221a5b6521d965d
Summary: Since the old EdenFS warning is usually for simply picking up new Eden releases, we can suggest that the user run a graceful restart instead of a normal restart to avoid running into `Transport not connected` errors. This path is only hit in Unix environments, so Windows users will not see this (since graceful restart isn't supported there yet). Since this is a manual step as well, it will be easier for a user to see if they run into an issue here. This can also enable us to get more telemetry from users running graceful restarts.
Reviewed By: wez
Differential Revision: D20901597
fbshipit-source-id: 9e5c9a90313901be159f66afcbbadc5d7af4fe28
Summary:
Sometimes the Rust io::Error is generated without an errno (e.g. pipe
0.2 would generate a BrokenPipe error without an errno). The Python
side uses errno to check the error type (Python does not have io::ErrorKind).
Therefore attempt to translate ErrorKind to a Python errno. Without this,
exiting the Rust pager early would crash like:
    StdioError: [Errno None] pipe reader has been dropped
    abort: pipe reader has been dropped
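The translation amounts to a lookup table with a fallback, sketched here in Python. The table is an illustrative subset chosen for this example, and `to_errno` is a hypothetical helper; the actual mapping lives in the Rust bindings and may cover different kinds.

```python
import errno

# Rust io::ErrorKind variant name -> closest POSIX errno (illustrative subset).
KIND_TO_ERRNO = {
    "NotFound": errno.ENOENT,
    "PermissionDenied": errno.EACCES,
    "AlreadyExists": errno.EEXIST,
    "BrokenPipe": errno.EPIPE,
    "Interrupted": errno.EINTR,
    "TimedOut": errno.ETIMEDOUT,
}


def to_errno(raw_os_error, kind):
    """Prefer the real OS error code; fall back to the kind-based guess."""
    if raw_os_error is not None:
        return raw_os_error
    # Kinds without a mapping still yield None, as before the change.
    return KIND_TO_ERRNO.get(kind)
```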
Reviewed By: markbt
Differential Revision: D20898559
fbshipit-source-id: ef863617e0e500d878ea0f9aeac06b4d87ffbcf2
Summary: This makes the tracing features easier to use.
Reviewed By: DurhamG
Differential Revision: D19797703
fbshipit-source-id: fb5cb17cd389575cf0134a708bcd9df3b90e9ab4
Summary:
On upload, read all the local blobs and upload them to the LFS server. This is
necessary to support `hg push` or `hg cloud sync` for local LFS blobs.
One of the changes made here is to switch from having the batch methods return an
Iterator to having them take a callback. This made it easier to write the guts
of the batch implementation in a more generic way.
A future change will also take care of moving local blobs to the shared store
after upload.
Reviewed By: DurhamG
Differential Revision: D20843136
fbshipit-source-id: 92d34a0971263829ff58e137e9905b527e18358d
Summary: This method will be used to upload local LFS blobs to the LFS server.
Reviewed By: DurhamG
Differential Revision: D20843137
fbshipit-source-id: 33a331c42687c47442189ee329da33cb5ce4d376
Summary:
Loose files make it easier to interact with a Mercurial server for tests; use
them instead of an IndexedLog.
Reviewed By: DurhamG
Differential Revision: D20786432
fbshipit-source-id: 61c1fc601d9a6ed157c5add9748e40840b081870
Summary:
This exposes the Rust's pager to Python. Right now it's using the system
terminal.
Reviewed By: DurhamG
Differential Revision: D20887174
fbshipit-source-id: c72f31a58475e76f8097c515dd29f911d2ac4df1
Summary:
Do not convert the entire output to a string. This makes `debugindexedlog dump`
a good test case for native pager support - it takes a while to write the full
output for a large input.
Reviewed By: DurhamG
Differential Revision: D20885567
fbshipit-source-id: 35ed8f68dff1916f0833577c3cf2a52cbf2a658c
Summary:
Implement the core API to start pager in native Rust. For now it is only
enabled for the entire command if `--pager=always` is set.
Reviewed By: DurhamG
Differential Revision: D20849644
fbshipit-source-id: 860b4e18d841da607864c3447d78dbac126f5f18
Summary:
This diff turns off the support_old_nightly feature of async-trait (https://github.com/dtolnay/async-trait/blob/0.1.24/Cargo.toml#L28-L32) everywhere in fbcode. I am getting ready to remove the feature upstream. It was an alternative implementation of async-trait that produces worse error messages but supports some older toolchains dating back to before stabilization of async/await that the default implementation does not support.
This diff includes updating async-trait from 0.1.24 to 0.1.29 to pull in fixes for some patterns that used to work in the support_old_nightly implementation but not the default implementation.
Differential Revision: D20805832
fbshipit-source-id: cd34ce55b419b5408f4f7efb4377c777209e4a6d
Summary:
Instead of using Sha256::from_slice, just use Sha256::from with a correctly
sized array.
Reviewed By: quark-zju
Differential Revision: D20756181
fbshipit-source-id: 17c869325025078e4c91a564fc57ac1d9345dd15
Summary:
Update `hg status` to print errors that were returned by EdenFS's
getScmStatusV2() call, and to exit unsuccessfully if there were any errors.
Previously errors were silently ignored.
Reviewed By: quark-zju
Differential Revision: D19958503
fbshipit-source-id: cb3109df40eb86a5bf7e3818ddfb8da74d670405
Summary:
Fix the `test_status()` function to properly canonicalize the input paths, so
that it works successfully on Windows. Previously this function would fail on
Windows as some paths would end up as UNC-style paths and some would be plain
paths, causing the equality comparison to fail even though the paths were
equivalent. This canonicalizes the repo path so that they both use the same
format.
Reviewed By: quark-zju
Differential Revision: D20662921
fbshipit-source-id: fdd36bac755f9694b4a482615d3dca43ff21e05e
Summary:
All of the callers are already using an Arc, so instead of forcing the remote
store to be cloneable, and thus wrap an inner self with an Arc, let's just pass
self as an Arc.
Reviewed By: DurhamG
Differential Revision: D20715580
fbshipit-source-id: 1bef23ae7da7b314d99cb3436a94d04134f1c0e4
Summary:
When LFS is enabled on fbsource, the enablement will be rolled out server-side,
with the server serving pointers (or not). In the catastrophic scenario where
Mononoke has to be rolled out, the Mononoke LFS server will be unable to serve
blobs, but some clients may still have LFS pointers locally but not the
corresponding blob. For this, we need to be able to fallback to fetching the
blob via the getpackv2 protocol.
Reviewed By: DurhamG
Differential Revision: D20662667
fbshipit-source-id: 4ac45558f6d205cbd1db33c21c6fb137a81bdbd5
Summary:
The LFS server might be temporarily having issues, let's retry a bit before
giving up.
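A retry loop of this kind might look like the Python sketch below. The attempt count, the exponential backoff, and the choice to retry on `OSError` are assumptions made for the example; the commit does not state which policy the LFS client uses.

```python
import time


def with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` calls have failed."""
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            # Wait longer after each failure (exponential backoff).
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.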
Reviewed By: DurhamG
Differential Revision: D20686659
fbshipit-source-id: 90dabd19e45a681d6eae5cd50c72b635d44c0517
Summary:
Since we have all the integer types, let's also allow float types in the
config.
Reviewed By: kulshrax
Differential Revision: D20697007
fbshipit-source-id: 21fa264d24c0f63c233f47c3bcfb2448b4c05c70
Summary:
When repacking for the purpose of file format changes, a single packfile may
contain data that needs to be moved out of it, and thus, we need to do a repack
then.
Reviewed By: DurhamG
Differential Revision: D20677442
fbshipit-source-id: c621dd2e657f5a4565b37d4b029731415b899117
Summary:
Remotestores can implement get_missing properly by simply querying the
underlying store that they will be writing to. This may prevent double fetching
some blobs in `hg prefetch` that we already have.
Reviewed By: DurhamG
Differential Revision: D20662668
fbshipit-source-id: 22140b5b7200c687e0ec723dd8879dc8fbea6fb9
Summary:
There are cases where the user of the abstraction needs to know if this is a
local store. This will simplify the caller code.
Reviewed By: DurhamG
Differential Revision: D20662666
fbshipit-source-id: e0bde7eb0dc3484979732a7c4cdf888fedc70e13
Summary:
By regularly flushing the blob store, we avoid keeping too many LFS blobs in
memory, which could cause OOM issues.
The default size is chosen to be 1GB, but is configurable for more control.
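The size-triggered flush can be illustrated with a small Python sketch. `BufferedBlobStore` is a hypothetical class written for this example, with a dict standing in for on-disk storage; the default threshold mirrors the roughly 1GB mentioned above.

```python
class BufferedBlobStore:
    """Keep blobs in memory and flush once a byte threshold is reached."""

    def __init__(self, max_pending_bytes=1 << 30):
        self.max_pending_bytes = max_pending_bytes
        self.pending = {}
        self.pending_bytes = 0
        self.flushed = {}  # stands in for on-disk storage

    def add(self, key, blob):
        # Replacing a pending blob must not count its bytes twice.
        self.pending_bytes += len(blob) - len(self.pending.get(key, b""))
        self.pending[key] = blob
        if self.pending_bytes >= self.max_pending_bytes:
            self.flush()

    def flush(self):
        self.flushed.update(self.pending)
        self.pending.clear()
        self.pending_bytes = 0
```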
Reviewed By: DurhamG
Differential Revision: D20646213
fbshipit-source-id: 12c06fd0212ef3974bea10c82026b6e74fb5bf21
Summary:
In the legacy lfs extension, LFS blobs were stored as loosefiles on disk, and
as we saw with loosefiles for remotefilelog, they can incur a significant
overhead to maintain. Due to LFS blobs being large by definition, the number of
loose LFS blobs should be reasonable for repack to walk over all of them to
choose which ones to throw away.
A different approach would be to simply store the blobs in an on-disk format
that allows automatic size management, and simple indexing. That format is an
IndexedLog. This of course doesn't come without drawbacks, the main one being
that the IndexedLog API mandates that the full blob be present on insertion,
preventing streaming writes to it. The solution is to simply chunk the blobs
before writing them to it. While proper streaming is not done just yet, the
storage format no longer prevents it from being implemented.
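The chunking idea can be sketched in Python with a list standing in for the append-only log. The `(key, index, chunk)` entry layout and the helper names are assumptions for illustration, not the on-disk format used by the LFS store.

```python
def split_chunks(blob, chunk_size):
    """Split `blob` into consecutive pieces of at most `chunk_size` bytes."""
    return [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]


def write_blob(log, key, blob, chunk_size):
    """Append one (key, index, chunk) entry per chunk to `log`."""
    for index, chunk in enumerate(split_chunks(blob, chunk_size)):
        log.append((key, index, chunk))


def read_blob(log, key):
    """Reassemble a blob from its chunks, ordered by chunk index."""
    chunks = sorted((i, c) for k, i, c in log if k == key)
    return b"".join(c for _, c in chunks)
```

Each entry is small and complete on its own, so a writer never needs the whole blob in memory at insertion time.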
Reviewed By: DurhamG
Differential Revision: D20633783
fbshipit-source-id: 37a88331e747cf22511aa348da2d30edfa481a60
Summary:
RotateLog loads older logs lazily. If an older log is broken, remember that and avoid
loading the broken log again.
Reviewed By: DurhamG
Differential Revision: D20663899
fbshipit-source-id: 7a4b5279cc6387c19329a51048bfe1be2e0bc1f8
Summary:
Due to the Mononoke LFS server only being available on FB's network, the tests
using them cannot run outside of FB, including in the github workflows.
Reviewed By: quark-zju
Differential Revision: D20698062
fbshipit-source-id: f780c35665cf8dc314d1f20a637ed615705fd6cf
Summary:
The IdDag provides graph algorithms using Segments.
The IdMap allows converting from the SegmentedChangelogId domain to the
ChangesetId domain.
The Dag struct wraps IdDag and IdMap in order to provide graph algorithms using
the common application level identifiers for commits (ChangesetId).
The construction of the Dag is currently mocked with something that can only be
used in a test environment (unit tests but also integration tests).
This diff also implements a location_to_name function. This is the most
important new functionality that segmented changelog clients require. It
recovers the hash of a commit for which the client only has a segmented
changelog Id. The current assumption is that clients have identifiers for all
merge commit parents so the path to a known commit always follows a set
of first parents.
The IdMap queries will have to be changed to async in the future, but IdDag
queries we expect to stay sync.
Reviewed By: quark-zju
Differential Revision: D20635577
fbshipit-source-id: 4f9bd8dd4a5bd9b0de55f51086f3434ff507963c
Summary: The interesting observation is that InProcessStore is not public.
Reviewed By: quark-zju
Differential Revision: D20635578
fbshipit-source-id: a0149929c8059ff77f047fd385bf3b26dc738dfd
Summary:
One of the main drawbacks of the current version of repack is that it writes
back the data to a packfile, making it hard to change file format. Currently, 2
file format changes are ongoing: moving away from packfiles entirely, and
moving from having LFS pointers stored in the packfiles, to a separate storage.
While an ad-hoc solution could be designed for this purpose, repack can
fulfill this goal easily by simply writing to the ContentStore; the
configuration of the ContentStore will then decide where this data will
be written.
The main drawback of this code is the unfortunate added duplication of code.
I'm sure there is a way to avoid it by having new traits, I decided against it
for now from a code readability point of view.
Reviewed By: DurhamG
Differential Revision: D20567118
fbshipit-source-id: d67282dae31db93739e50f8cc64f9ecce92d2d30
Summary:
While the primary (for now) way of addressing an LFS blob is via its sha256,
being able to address them via different hash schemes (sha1 for Eden/Buck,
blake2, etc) will be helpful down the line. Thus, let's store a HashMap of
ContentHash in the pointer store.
Reviewed By: DurhamG
Differential Revision: D20560197
fbshipit-source-id: 8bdc4fc4cd7fc19c7eed6a27d11953c4eedf9195
Summary: No locking is required for this one due to being loose files on disk.
Reviewed By: DurhamG
Differential Revision: D20522890
fbshipit-source-id: 72b7ebc063060a89f54976a1128977a3b7501053
Summary:
Instead of having the magic number 0x2000 all over the place, let's move the
logic to this method.
Reviewed By: DurhamG
Differential Revision: D20637749
fbshipit-source-id: bf666f8787e37e6d6c58ad8982a5679b7e3e717b
Summary:
`iter_segments_with_parent` has a few more conditions attached to it than the
name would imply. We are renaming it to give a better sense of its true
behavior.
Reviewed By: quark-zju
Differential Revision: D20547631
fbshipit-source-id: 406f46b9de5efc9e8e6a8c4bc22ab18fa5bc54bb
Summary:
The main question I had while writing the tests was whether we expect a
specific order for Segments for `iter_segments_with_parent`. `InProcessStore`
will return the segments in the order that they were inserted.
Reviewed By: quark-zju
Differential Revision: D20501401
fbshipit-source-id: 48ceb78f3191c7425c1488a3392cf3167f7e7268
Summary:
First 6 methods implemented from the IdDagStore trait for the InProcessStore.
Any suggestions welcome.
Reviewed By: quark-zju
Differential Revision: D20499228
fbshipit-source-id: cb536a3a0136077ada78934d82a25d079a5bc809
Summary:
Replace `rust-crypto` with `hex`, `sha-1`, `sha2`.
- `crypto::sha1::Sha1` with `sha1::Sha1`
- `crypto::sha2::Sha2` with `sha2::Sha2`
- `crypto::digest::Digest` with `sha1::Digest` and `sha2::Digest`
- `.result_str()` with `hex::encode` and `.result()`
Reviewed By: jsgf
Differential Revision: D20588313
fbshipit-source-id: 75c4342e8b6285f0f960f864c21457a1a0808f64
Summary:
In a strongly typed language, using strings should be avoided whenever possible
as they do not provide the safety guarantees that types provide.
I took the liberty of removing all the filesystems that are not relevant for
Mercurial for simplification reasons. If needs arise, we can always add a new
FsType to the enum.
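A minimal sketch of what replacing strings with an enum can look like. The variants listed here and the `from_name` helper are assumptions for illustration; the real `FsType` may contain a different set.

```rust
use std::fmt;

/// Illustrative subset of filesystem types (the variants are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FsType {
    Ext4,
    Ntfs,
    Eden,
    Unknown,
}

impl FsType {
    /// Convert a raw name, as reported by the OS, into a typed value.
    fn from_name(name: &str) -> FsType {
        match name {
            "ext4" => FsType::Ext4,
            "ntfs" => FsType::Ntfs,
            "eden" => FsType::Eden,
            _ => FsType::Unknown,
        }
    }
}

impl fmt::Display for FsType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            FsType::Ext4 => "ext4",
            FsType::Ntfs => "ntfs",
            FsType::Eden => "eden",
            FsType::Unknown => "unknown",
        };
        f.write_str(name)
    }
}
```

Callers can then `match` on the enum, and the compiler flags any variant they forget to handle.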
Reviewed By: DurhamG
Differential Revision: D20517138
fbshipit-source-id: 0a38b53c6a87f05f4b2d664038e10c4293de96ae
Summary:
Replace `rust-crypto` with `sha-1`:
- `crypto::digest::Digest` with `sha1::Digest`
- `crypto::Sha1` with `sha1::Sha1`
The interface changes slightly - no need to pass a mutable byte array when
getting the result.
Reviewed By: jsgf
Differential Revision: D20587638
fbshipit-source-id: c6c737f3f8eba94b98c728e198eb4fac12c5c80b
Summary:
Swap out `rust-crypto` for `sha-1`
- `crypto::sha1::Sha1` is replaced by `sha1::Sha1`
- `crypto::digest::Digest` is replaced by `digest::Digest`
Reviewed By: jsgf
Differential Revision: D20587685
fbshipit-source-id: 971fdaa8ce5b3e9e60db219131f6c36dcbc213d9
Summary:
Switched out the `sha` package for the `rust-crypto` package. The
apis aren't an exact match, so I had to insert a clone in place of
a modification to a mutable reference.
Reviewed By: jsgf
Differential Revision: D20585336
fbshipit-source-id: 22245157aea1115ae6f225b17b0346f0696653f7
Summary:
According to the anyhow documentation[0], the behavior of `.to_string()` is to
only stringify the top-level errors, hiding all the context of the error.
Instead, the debug format allows all the context to be displayed and, if
available, the backtrace.
This should significantly help debug Rust errors when context is available,
which we should strive to have everywhere!
[0]: https://docs.rs/anyhow/1.0.27/anyhow/struct.Error.html#display-representations
Reviewed By: sfilipco
Differential Revision: D20575944
fbshipit-source-id: 2968d7fb755edec7f7e5151138e8049ded181c1b
Summary: The signatures were used by the linter to warn if the files required regenerating. Since the linter now regenerates the files regardless of the signature, signing the files is no longer needed.
Reviewed By: krallin
Differential Revision: D20467745
fbshipit-source-id: aff2643f80939d5693e7a30abf07484c9060796f
Summary:
This is only intended for Mercurial .t tests and not in any production
environment.
Reviewed By: DurhamG
Differential Revision: D20504236
fbshipit-source-id: 618e17631b73afa650875cb7217ba7c55fb9f737
Summary:
For now, this is only used for LFS, as this is the only store that can
correctly answer both.
This API will be exposed to Python to be able to have cheap filectx comparison,
and other use cases.
Reviewed By: DurhamG
Differential Revision: D20504234
fbshipit-source-id: 0edb912ce479eb469d679b7df39ba80fceef05f2
Summary:
This enables fetching blobs from the LFS server. For now, this is limited to
fetching them, but the protocol specifies ways to also upload. That second part
will matter for commit cloud and when pushing code to the server.
One caveat to this code is that the LFS server is not mocked in tests, and thus
requests are done directly to the server. I chose very small blobs to limit the
disruption to the server. By setting a test specific user-agent, we should be
able to monitor traffic due to tests and potentially rate limit it.
Reviewed By: DurhamG
Differential Revision: D20445628
fbshipit-source-id: beb3acb3f69dd27b54f8df7ccb95b04192deca30
Summary:
This is the start of migrating blackbox events to tracing events. The
motivation is to have a single data source for log processing (for simplicity)
and the tracing data seems a better fit, since it can represent a tree of
spans, instead of just a flat list. Eventually blackbox might be mostly
a wrapper for tracing data, with some minimal support for logging some indexed
events.
Reviewed By: DurhamG
Differential Revision: D19797710
fbshipit-source-id: 034f17fb5552242b60e759559a202fd26061f1f1
Summary:
Now that Segment has no lifetime, we can create it directly and return the ownership.
Performance of "building segments" does not seem to change:
# before
building segments 750.129 ms
# after
building segments 712.177 ms
Reviewed By: sfilipco
Differential Revision: D20505200
fbshipit-source-id: 2448814751ad1a754b90267e43262da072bf4a16
Summary:
This allows structures like BTreeMap to own and store Segment.
It was not possible until D19818714, which adds minibytes::Bytes interface for
indexedlog.
In theory this hurts performance a little bit. But the perf difference does not
seem visible by `cargo bench --bench dag_ops`:
# before
building segments 714.420 ms
ancestors 54.045 ms
children 490.386 ms
common_ancestors (spans) 2.579 s
descendants (small subset) 406.374 ms
gca_one (2 ids) 161.260 ms
gca_one (spans) 2.731 s
gca_all (2 ids) 287.857 ms
gca_all (spans) 2.799 s
heads 234.130 ms
heads_ancestors 39.383 ms
is_ancestor 113.847 ms
parents 251.604 ms
parent_ids 11.412 ms
range (2 ids) 117.037 ms
range (spans) 241.156 ms
roots 507.328 ms
# after
building segments 750.129 ms
ancestors 53.341 ms
children 515.607 ms
common_ancestors (spans) 2.664 s
descendants (small subset) 411.556 ms
gca_one (2 ids) 164.466 ms
gca_one (spans) 2.701 s
gca_all (2 ids) 290.516 ms
gca_all (spans) 2.801 s
heads 240.548 ms
heads_ancestors 39.625 ms
is_ancestor 115.735 ms
parents 239.353 ms
parent_ids 11.172 ms
range (2 ids) 115.483 ms
range (spans) 235.694 ms
roots 506.861 ms
Reviewed By: sfilipco
Differential Revision: D20505201
fbshipit-source-id: c34d48f0216fc5b20a1d348a75ace89ace7c080b
Summary:
The latter is what is now recommended, and no longer requires a macro to
initialize a lazy value, leading to nicer code.
Reviewed By: DurhamG
Differential Revision: D20491488
fbshipit-source-id: 2e0126c9c61d0885e5deee9dbf112a3cd64376d6
Summary:
Lots of different warnings on this one. Main ones were:
- One bug where .write was used instead of .write_all
- Using .next instead of .nth(0) for iterators,
- Using .cloned() instead of .map(|x| x.clone())
- Using conditions as expressions instead of mut variables
- Using .to_vec() on slices instead of .iter().cloned().collect().
- Using .is_empty instead of comparing .len() against 0.
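For reference, the patterns in the list above look roughly like this. The function names are made up for the sketch; only the standard library calls are real.

```rust
use std::io::{self, Write};

/// `write` may write only part of the buffer; `write_all` loops until
/// everything is written or an error occurs.
fn save(out: &mut impl Write, data: &[u8]) -> io::Result<()> {
    out.write_all(data)
}

fn first_and_all(values: &[String]) -> (Option<String>, Vec<String>) {
    let mut iter = values.iter();
    // `.next()` rather than `.nth(0)`, `.cloned()` rather than `.map(|x| x.clone())`.
    let first = iter.next().cloned();
    // `.to_vec()` rather than `.iter().cloned().collect()`.
    let all = values.to_vec();
    (first, all)
}

/// The condition is used as an expression instead of assigning to a `mut`
/// variable, and `.is_empty()` replaces `.len() == 0`.
fn label(values: &[u8]) -> &'static str {
    if values.is_empty() {
        "empty"
    } else {
        "non-empty"
    }
}
```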
Reviewed By: DurhamG
Differential Revision: D20469894
fbshipit-source-id: 3666a44ad05e0fbfa68d490595703c022073af63
Summary:
These were from a wide variety of warnings. The only one I haven't addressed is
that clippy complains that Pin<Box<Vec<u8>>> can be replaced by Pin<Vec<u8>>. I
haven't investigated too much into it, someone more familiar with this code can
probably figure out if this is buggy or not :)
Reviewed By: DurhamG
Differential Revision: D20469647
fbshipit-source-id: d42891d95c1d21b625230234994ab49bbc45b961
Summary:
This belongs to D20149376. However buck test does not include benchmarks so it
was not noticed.
Reviewed By: DurhamG
Differential Revision: D20505097
fbshipit-source-id: 24daeb17b68808f8e69e18452ab2cf26c7aa10a7
Summary:
The mutation store stores entries with a floating-point timestamp. This
pattern was copied from obsmarkers.
However, Mercurial uses integer timestamps in the commit metadata (the
parser supports floats for historical reasons, but only stores integer
timestamps). Mononoke also uses integer timestamps in its `DateTime`
type.
To keep things simple, switch to using integer timestamps for mutation
entries. Existing entries with floating point timestamps are truncated.
Add a new entry format version that encodes the timestamp as an integer.
For now, continue to generate the old version so that old clients can
read entries created by new clients.
Reviewed By: quark-zju
Differential Revision: D20444366
fbshipit-source-id: 4d6d9851aacb314abea19b87c9d0130c47fdf512
Summary:
Tracking the origin of mutation entries did not prove useful, and just creates
an un-necessary overhead. Remove the tracking and repurpose the field as a
version field.
Reviewed By: quark-zju
Differential Revision: D20444365
fbshipit-source-id: 65ff11ee8cfe77d5e67a83d03a510541d58ef69b
Summary: Using ptr.add is shorter and preferred to ptr.offset.
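A small sketch of the difference, using a made-up `read_at` helper: `ptr.add(n)` takes a `usize`, so it avoids the `as isize` cast that `ptr.offset` needs.

```rust
/// Read the element `index` positions after the start of `data`.
/// Hypothetical helper, only here to show `add` versus `offset`.
fn read_at(data: &[u32], index: usize) -> u32 {
    assert!(index < data.len(), "index out of bounds");
    let ptr = data.as_ptr();
    // SAFETY: `index` was bounds-checked above.
    // Equivalent to `*ptr.offset(index as isize)`, without the cast.
    unsafe { *ptr.add(index) }
}
```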
Reviewed By: quark-zju
Differential Revision: D20452752
fbshipit-source-id: 1dc2fdbc392267d2d690673c10dcc161ecd00dfa
Summary:
These warnings are fairly trivial, as they recommend using a single quote (char)
for single character searches instead of a double quote (str).
Reviewed By: quark-zju
Differential Revision: D20452408
fbshipit-source-id: b2951e133e57633a8e766536e22969fa9ac0ecee
Summary:
Clippy had 3 sources of warnings in this crate:
- from_str method not in impl FromStr. We still have 2 of them in path.rs, but
this is documented as not supported by the FromStr trait due to returning a
reference. Maybe we can find a different name?
- Use of mem::transmute while casts are sufficient. I find the cast to be
ugly, but they are simply safer as the compiler can do some type checking on
them.
- Unnecessary lifetime parameters
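Two of these warnings can be illustrated with a short sketch. The functions below are invented for the example and are not taken from the crate.

```rust
/// A plain cast instead of `mem::transmute::<*const u8, *const i8>(ptr)`.
/// The compiler checks that this pointer-to-pointer cast is a valid one.
fn as_signed(ptr: *const u8) -> *const i8 {
    ptr as *const i8
}

/// Lifetime elision makes `fn first_word<'a>(s: &'a str) -> &'a str`
/// unnecessary; the shorter signature means the same thing.
fn first_word(s: &str) -> &str {
    s.split(' ').next().unwrap_or("")
}
```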
Reviewed By: quark-zju
Differential Revision: D20452257
fbshipit-source-id: 94abd8d8cd76ff7af5e0bbfc97c1e106cdd142b0
Summary:
Clippy complains about 3 things:
- Using raw pointers in a public function that is not declared as unsafe. This
happens for C exported ones, this feels like a warning, so I haven't changed
it.
- Using .map(...).unwrap_or(<default value constructed>). The recommendation
is to use .unwrap_or_default().
- Single match instead of if let, the latter makes code much shorter.
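A sketch of the last two items, with hypothetical function names:

```rust
use std::collections::HashMap;

fn lookup(map: &HashMap<String, String>, key: &str) -> String {
    // Instead of `.map(|v| v.clone()).unwrap_or(String::new())`.
    map.get(key).cloned().unwrap_or_default()
}

fn describe(value: Option<u32>) -> String {
    let mut out = String::from("value:");
    // Instead of `match value { Some(v) => { ... } _ => {} }`.
    if let Some(v) = value {
        out.push_str(&v.to_string());
    }
    out
}
```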
Reviewed By: quark-zju
Differential Revision: D20452751
fbshipit-source-id: 8eeff7581c119c651ca41d8117f1f70f15774833
Summary:
Right now the module has one implementation IndexedLogStore. The name could
be more specific in the context of the crate.
The goal will be to add a trait for storage requirements of IdDag and
make IndexedLogStorage one implementation of that trait.
Reviewed By: quark-zju
Differential Revision: D20446042
fbshipit-source-id: 7576e1cc4ad757c1a2c00322936cc884838ff710
Summary:
Purge needs to be able to see what directories the walker traversed, so
it can delete them if they are empty. Instead of having the walker call
match.traversedir (it seems like a bizarre pattern to use the matcher as a
holder for a non-matching related function), let's have the walker return an
enum and have an option to return directories.
At the python layer we then translate this into match.traversedir calls, but we
can clean that up later.
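The shape of such a result type could look like the sketch below. `WalkEntry`, the `include_directories` flag, and the pre-listed input are all assumptions for illustration; the real walker reads the filesystem.

```rust
use std::path::PathBuf;

/// Hypothetical walker result: either a file or a traversed directory.
#[derive(Debug, PartialEq)]
enum WalkEntry {
    File(PathBuf),
    Directory(PathBuf),
}

/// Stand-in walker over pre-listed `(path, is_dir)` pairs.
/// Directories are only reported when the caller asks for them.
fn walk(entries: &[(&str, bool)], include_directories: bool) -> Vec<WalkEntry> {
    entries
        .iter()
        .filter_map(|&(path, is_dir)| {
            if !is_dir {
                Some(WalkEntry::File(PathBuf::from(path)))
            } else if include_directories {
                Some(WalkEntry::Directory(PathBuf::from(path)))
            } else {
                None
            }
        })
        .collect()
}
```

A purge-like caller would pass `include_directories = true` and check each `Directory` entry for emptiness, while other callers keep seeing files only.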
Reviewed By: quark-zju
Differential Revision: D19543795
fbshipit-source-id: cc51c86c91799d3df2c65d25a7b6cfe810206d0a
Summary:
In preparation for supporting returning directories from the walker (to
support purge), let's rename the result structure to be more generic.
Reviewed By: kulshrax
Differential Revision: D19543791
fbshipit-source-id: 9b71452c879cf397ae92533a4ef4727140ac7369
Summary:
The mercurial tests print errors when they encounter 'fifo' files.
Let's handle that case.
Differential Revision: D19543796
fbshipit-source-id: f87d4b9c3f0ad8b8d8ebe2e6d18e325fc93d0ae9
Summary:
While the sha256 of a blob gives access to its content, it doesn't allow
accessing its metadata, by adding a sha256 index, we can easily get the
metadata of a blob via its content hash.
Reviewed By: quark-zju
Differential Revision: D20445624
fbshipit-source-id: 42c04bd69d3c7380706c6237c5b4f4061c016cca
Summary: This is necessary to properly test LFS stores.
Reviewed By: quark-zju
Differential Revision: D20445625
fbshipit-source-id: 530ddf87249e8d721957806f2d8edef3262f303c
Summary:
The OpenOptions allow for multiple indices to be added, but lookup had no way
to query these multiple indices.
Reviewed By: quark-zju
Differential Revision: D20445627
fbshipit-source-id: 0cb754ba17b452d892b7bcb56d502d5753ef963a
Summary:
This type can either be a Mercurial type key, or a content hash based key. Both
the prefetch and get_missing now can handle these properly. This is essential
for stores where data can be fetched in both ways or where the data is
split in 2. For LFS for instance, it is possible to have the LFS pointer (via
getpackv2), but not the actual blob. In which case get_missing will simply
return the content hash version of the StoreKey, to signify what it actually
has missing.
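To make the LFS case concrete, here is a simplified sketch. The field layout of `StoreKey`, the `SplitStore` type, and the use of plain strings for hashes are assumptions for the example, not the real definitions.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, PartialEq)]
enum StoreKey {
    /// Mercurial type key: path plus node (strings here for brevity).
    HgId { path: String, node: String },
    /// Content hash based key, e.g. the sha256 of an LFS blob.
    Content(String),
}

/// Hypothetical store whose data is split in 2: pointers and blobs.
struct SplitStore {
    pointers: HashMap<(String, String), String>,
    blobs: HashSet<String>,
}

impl SplitStore {
    fn get_missing(&self, keys: &[StoreKey]) -> Vec<StoreKey> {
        let mut missing = Vec::new();
        for key in keys {
            match key {
                StoreKey::HgId { path, node } => {
                    match self.pointers.get(&(path.clone(), node.clone())) {
                        // Pointer present but blob absent: report the content hash.
                        Some(hash) if !self.blobs.contains(hash) => {
                            missing.push(StoreKey::Content(hash.clone()))
                        }
                        Some(_) => {}
                        // Nothing known about this key at all.
                        None => missing.push(key.clone()),
                    }
                }
                StoreKey::Content(hash) => {
                    if !self.blobs.contains(hash) {
                        missing.push(key.clone());
                    }
                }
            }
        }
        missing
    }
}
```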
Reviewed By: quark-zju
Differential Revision: D20445631
fbshipit-source-id: 06282f70214966cc96e805e9891f220b438c91a7
Summary:
Similarly to the DataStore trait, this makes it easier to understand that they
deal with a Mercurial type Key.
Reviewed By: quark-zju
Differential Revision: D20445621
fbshipit-source-id: a1143d5f5d6a2c8686d517a6ea3c25b07c0df072
Summary: This makes it clear that these traits are dealing with Mercurial Key.
Reviewed By: quark-zju
Differential Revision: D20445626
fbshipit-source-id: d5acbf442e9407b973e95e40af69b5a61bff0a4d
Summary:
Since configparser enforces utf-8 config files (because pest wants Rust strings),
let's migrate from Bytes to Text to remove extra encoding conversions.
Previously this was blocked by the lack of ref-counted text (since the "source"
of each config location is the entire config file). Now minibytes provides Text
so we can use it.
This unfortunately requires dependent code to be updated. The pyconfigparser
interface is in theory wrong - it shouldn't return utf-8 bytes but
local-encoded bytes. I think it's cleaner to make pyconfigparser unaware of
HGENCODING, so I changed pyconfigparser to use unicode, and add compatibility
layer in uiconfig.py.
This also fixes non-ascii encoding issues on user name (especially on Windows).
The hgrc config file should be in utf-8 and the config parser returns explicit
unicode types, and Python code round-trip them with local encodings.
Reviewed By: markbt
Differential Revision: D20432938
fbshipit-source-id: b1359429b8f1c133ab2d6b2deea6048377dfeca1
Summary:
This makes it easier to further migrate to `Text` interface.
Dependent crate (`auth`) is updated.
Reviewed By: markbt
Differential Revision: D20432941
fbshipit-source-id: 1dc29d52c9b17ce14676ef0555470c6d36a09c2b
Summary:
Text is a reference-counted shared String.
It's similar to Bytes but works for utf-8 strings.
The motivation is to replace configparser's use of Bytes with Text.
Reviewed By: markbt
Differential Revision: D20432940
fbshipit-source-id: ef990255d269e60d433c6520819f60ccdcbe488f
Summary: This makes it possible to implement "Text". See the next diff.
Reviewed By: markbt
Differential Revision: D20432943
fbshipit-source-id: 94b3810ab205c260d33f57bd637e4accc3ee871d
Summary:
This makes the API easier to use.
Practically this makes it easier for configparser to migrate to minibytes.
Reviewed By: markbt
Differential Revision: D20432942
fbshipit-source-id: ad08eb118d2216054dc24c86b0b129ae82b9d17c
Summary:
Previously Rust str was serialized into bytes. To be Python 3 friendly, let's
serialize it into `str`.
Reviewed By: markbt
Differential Revision: D19797706
fbshipit-source-id: 388eb044dc7e25cdc438f0c3d6fa5a5740f22e3d
Summary:
The goal of the stack is to support "rendering" diffs for large files in scs
server. Note that rendering is in quotes - we are fine with just showing a
placeholder like "Binary file ... differs". This is still better than the
current behaviour which just returns an error.
In order to do that I suggest to tweak xdiff library to accept FileContentType
which can be either Normal(...) meaning that we have file content available, or
Omitted, which usually means the file is large and we don't even want to fetch it, and we
just want xdiff to generate a placeholder.
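Roughly, the idea could be sketched as follows. The payload type of `Normal`, the `placeholder` function, and the exact placeholder wording are assumptions; the real xdiff interface may differ.

```rust
/// Sketch of the proposed type: content is either available or omitted.
enum FileContentType {
    Normal(Vec<u8>),
    Omitted,
}

/// Returns a placeholder line when either side was omitted.
/// `None` means both contents are available and a real diff can be computed.
fn placeholder(path: &str, old: &FileContentType, new: &FileContentType) -> Option<String> {
    match (old, new) {
        (FileContentType::Normal(_), FileContentType::Normal(_)) => None,
        _ => Some(format!("Binary file {} differs", path)),
    }
}
```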
Reviewed By: markbt, krallin
Differential Revision: D20389226
fbshipit-source-id: 0b776d4f143e2ac657d664aa9911f6de8ccfea37
Summary:
This will be used in the Python world for legacy reasons. It shouldn't be used
in new Rust code.
To use it, the name `LegacyCodeNeedIdAccess` has to be used so we can do a code
search to find all users of it.
Reviewed By: sfilipco
Differential Revision: D20367834
fbshipit-source-id: 9b93a29f1461ce24bba6f31a2bbb1f327e216c6d
Summary: This will be useful to actually sort commits.
Reviewed By: sfilipco
Differential Revision: D20367835
fbshipit-source-id: 43bc7835277af3a14ef323ce34247e0c03878dc8
Summary:
The old "AllSet" implementation is not very practical - it does not support
iteration. Practically, the "all()" set comes from the DAG. Change the "all"
concept to a hint similar to "is_topo_sorted", and update the fast path
(intersection) accordingly.
Reviewed By: sfilipco
Differential Revision: D20367837
fbshipit-source-id: fdbf370897c93058bfcab0571c1f6fa4b99b0f6b
Summary: The word "snapshot" more accurately describes its purpose.
Reviewed By: sfilipco
Differential Revision: D20367836
fbshipit-source-id: c91a0bd402fa1718b5d805beedc0e062824c53d3
Summary:
Without this:
In [3]: util.getfstype('')
IOError: [Errno 2] No such file or directory (os error 2)
And there is a code path hitting this:
File "edenscm/mercurial/util.py", line 1483, in checknlink
fstype = getfstype(os.path.dirname(testfile))
# testfile = '.'
# os.path.dirname(".") = ""
The old implementation works fine for an empty path:
In [2]: m.util.getfstype('')
Out[2]: 'eden'
So let's make the new Rust implementation consistent.
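One way to get that behavior is to map an empty path to the current directory before querying the filesystem. This is only a guess at the approach; `normalize_path` is a hypothetical helper and the actual fix may differ.

```rust
use std::path::Path;

/// Treat an empty path as the current directory, since
/// `os.path.dirname(".")` yields `""` on the Python side.
fn normalize_path(path: &str) -> &Path {
    if path.is_empty() {
        Path::new(".")
    } else {
        Path::new(path)
    }
}
```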
Reviewed By: xavierd
Differential Revision: D20313387
fbshipit-source-id: 258c424a3e8a796d983e20b0d4656e8e3f413706
Summary: Similar to D13982877. Try to get names like "fuse.ntfs".
Reviewed By: farnz
Differential Revision: D20313392
fbshipit-source-id: 8363d3d92843e6afb53a0003950be083034bd841
Summary:
Only keep type parameters at the top-level function.
This reduces the binary size and speeds up rustc.
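The usual shape of this pattern is a thin generic wrapper over a non-generic body. The function below is invented for illustration and is not the one changed in this diff.

```rust
use std::path::Path;

/// Generic entry point: only this thin wrapper is instantiated per caller type.
fn component_count<P: AsRef<Path>>(path: P) -> usize {
    component_count_inner(path.as_ref())
}

/// Non-generic body: compiled once, no matter how many types call the wrapper.
fn component_count_inner(path: &Path) -> usize {
    path.components().count()
}
```

Callers can still pass `&str`, `String` or `PathBuf`, but the real work is compiled only once.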
Reviewed By: xavierd
Differential Revision: D20313388
fbshipit-source-id: 29d77731ff462fee1f1bb9f234601e3430198ae7
Summary: This makes the code a bit more portable.
Reviewed By: xavierd
Differential Revision: D20313389
fbshipit-source-id: 080538939fa4d2d72e5905f23ad9be987d952748
Summary:
Rename the main method to "fstype". The API has no relation with repo.
So let's rename it.
Reviewed By: xavierd
Differential Revision: D20313386
fbshipit-source-id: 80dd1231ccccfe945150b117b151bce773f0dfeb
Summary:
Since the mocked memcache is shared between the tests, we need to make sure the
keys used by the tests are different, otherwise they are just caching each
other's data.
Reviewed By: ikostia
Differential Revision: D20388783
fbshipit-source-id: 0f2f926e0ffe0e52e55291e46142808ce0921288
Summary:
Some `use`s are not used on Windows. The code was also formatted using the
latest rustfmt.
Reviewed By: xavierd
Differential Revision: D20379704
fbshipit-source-id: ffadcd68e4e0440dcbd2a4e1ad8532b47a9d83e2
Summary: Similarly to the ContentStore, remove the Arc from MetadataStore.
Reviewed By: quark-zju
Differential Revision: D20376838
fbshipit-source-id: 4321600b752c919b6d9fa7bdee6f6cb7ae083b10
Summary:
The clients should use an Rc/Arc if they need the ability to clone it. This
makes it more obvious and reduces the number of pointer indirections.
Reviewed By: quark-zju
Differential Revision: D20376839
fbshipit-source-id: c56e7e8f89ab17727be621894c329e344a7f3adb
Summary:
The dag crate is designed to work with any kind of binary commit hashes (ex. bonsai,
git or hg). The only use of `types` is to convert from binary to hex. Since dag
already has its own `to_hex` logic in `VertexName`, let's use that instead.
Reviewed By: sfilipco
Differential Revision: D20378447
fbshipit-source-id: 00ecb551ea927fdb60dd91e5e645064f23139bcd
Summary:
Recently there are some Windows-related test flakiness in . All of them are
caused by `file.persist(path)` in `atomic_write_plain` failing with
"Access Denied". Since that can be caused by Windows Anti-Virus scans or other
weird stuff, let's work around it using automatic retries.
Process Explorer does not provide extra information:
indexedlog-d0c6135fd7ed9ece.exe 5868 SetRenameInformationFile C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpcfDsQQ ACCESS DENIED ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta
A successful rename looks like:
indexedlog-d0c6135fd7ed9ece.exe 5868 SetRenameInformationFile C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpbXEVw0 SUCCESS ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta
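A retry loop of this kind might look like the sketch below. The helper name, the attempt limit, and the delay are assumptions; the real `atomic_write_plain` may retry differently.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Run `op`, retrying while it fails with `PermissionDenied`
/// (which is how "Access Denied" surfaces in `std::io`).
/// Other errors, and the final failure, are returned unchanged.
fn retry_on_permission_denied<T>(
    max_attempts: u32,
    delay: Duration,
    mut op: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut attempt = 1;
    loop {
        match op() {
            Err(e) if e.kind() == io::ErrorKind::PermissionDenied && attempt < max_attempts => {
                attempt += 1;
                thread::sleep(delay);
            }
            other => return other,
        }
    }
}
```

The closure would wrap the rename step, which assumes the operation can safely be attempted again after a failure.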
Reviewed By: ikostia
Differential Revision: D20379618
fbshipit-source-id: db3e6be3d785875486f7a517df11cbf58bf65ddd
Summary:
Now that the ContentStore can automatically strip the metadata header, no need
for duplicated code in the backingstore.
Reviewed By: fanzeyi
Differential Revision: D20376812
fbshipit-source-id: e863e1cc2fcdc8b9e612a464b305fa25ceb66e13
Summary:
During `hg update`, Mercurial forks multiple processes to write files on disk
concurrently, this is done as fetching blobs from the content store, and
writing them to disk is CPU bound. Usually, threads would be the preferred way
of speeding up such a process, but unfortunately, Python has a GIL that severely
limits the available concurrency. So, multiple processes were chosen.
Unfortunately, the multi-process solution also brings a lot of other issues,
more recently, we've had cases where the connections to the server and memcache
had to be dropped after the fork. In some other cases, this caused deadlocks.
And the solution is not effective on Windows.
Now that Mercurial is getting more and more Rust, we could instead go back to
the threads solution by using them in Rust, and have Python just push work to
them, this is exactly what this change does.
Things that are left to be done, but I wanted to get a diff out first:
- no file path audit
- no file backup
- no symlink creation
- probably other things I'm missing
Reviewed By: quark-zju
Differential Revision: D20102888
fbshipit-source-id: d47829fd7818b97710586b9851880f178048e27b
Summary:
With this new store, blobs will be transparently written to either an LFS
store, or a non-LFS one, depending on their size.
Initially, and as long as getpackv2 is supported, we also need to support
parsing lfs pointer data that the server is sending and write these to the lfs
pointer store. This code is very ad hoc and does manual parsing of the pointer
data, which is definitely not great. Suggestions for a simple and better
solution are welcome :).
From a migration standpoint, the read-only LFS stores are added to the
ContentStore, this allows blobs written in it to be readable at all time even
when `remotefilelog.lfs` isn't set. The code will effectively be dormant for a
while until the option is turned on, if we need to disable it, the dormant code
will still be able to read all the blobs written to disk. This forces us to
deploy a release that contains this code to stable first, before setting
`remotefilelog.lfs`.
Reviewed By: quark-zju
Differential Revision: D19986878
fbshipit-source-id: 260f5a542d52e748c0c703bfa7bb8ffac0e7b388
Summary: This makes `RUST_LOG` work for indexedlog tests.
Reviewed By: xavierd
Differential Revision: D20286515
fbshipit-source-id: ff4a1476eb01a9067dabe3622fd598f65fe86a18
Summary:
The tracing / env_logger integration works for hg as a binary. However I'd also
like to use it in library tests. This crate makes it easier to do so.
Reviewed By: xavierd
Differential Revision: D20286507
fbshipit-source-id: f5bf3288ce950591ddfe64b524ad51ce21ee4099
Summary: Those have helped me debug some issues.
Reviewed By: xavierd
Differential Revision: D20286513
fbshipit-source-id: 012ddb16c2d0efd8f8697a5ecd4564ea31d65630
Summary: Move the scope of spans so the exit code is shown.
Reviewed By: xavierd
Differential Revision: D20286516
fbshipit-source-id: f39cbf60c86ea19a1bb0a09958748f04ff6a42e8
Summary:
Previously env_logger is only initialized if Python is initialized.
This diff makes env_logger initialized for Rust native commands.
Reviewed By: xavierd
Differential Revision: D20286517
fbshipit-source-id: 18fee96c2b41db1da9648d615d1e18809de90a63
Summary:
This means crates like env_logger (which reads $RUST_LOG, and writes to stderr)
can be used for convenient debugging.
Reviewed By: xavierd
Differential Revision: D20286514
fbshipit-source-id: e3b80cc4830ba5cc6dbf7aa1cbb92a4f4f046a54
Summary:
Those metadata include module_path, target, line number, etc, in Rust native
format. They will be used for the upcoming `log` integration.
Reviewed By: xavierd
Differential Revision: D20286510
fbshipit-source-id: 27019b941bef08c0bb3e505bbdae642282dcb141
Summary:
Splitting lock file acquisition from `IdDag::prepare_filesystem_sync` to its own
function.
Useful when looking ahead to split IdDag from IndexedLog.
Reviewed By: quark-zju
Differential Revision: D20316443
fbshipit-source-id: a0fd43439730376920706bb4349ce497f6624335
Summary:
This removes an inline use of the indexedlog indexes.
This is going to be useful when we try to separate IndexedLog specifics from
IdDag functionality.
Reviewed By: quark-zju
Differential Revision: D20316058
fbshipit-source-id: 942a0a71660bb327376c81fd3ac435d002ecca6e
Summary:
Instead of returning `anyhow::Error` wrapping an `ErrorKind` enum
from each Thrift client method, just return an error type specific
to that method. This will make error handling simpler and less
error-prone by removing the need to downcast the returned error.
This diff also removes the `ErrorKind` enums so that we can be sure
that there are no leftover places trying to downcast to them.
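As an illustration of why a method-specific error type is easier to handle, consider the sketch below. `GetCommitError`, `get_commit`, and the variants are hypothetical and do not come from the generated Thrift code.

```rust
use std::fmt;

/// Hypothetical error type for a single client method.
#[derive(Debug, PartialEq)]
enum GetCommitError {
    NotFound(String),
    Transport(String),
}

impl fmt::Display for GetCommitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GetCommitError::NotFound(id) => write!(f, "commit {} not found", id),
            GetCommitError::Transport(msg) => write!(f, "transport error: {}", msg),
        }
    }
}

impl std::error::Error for GetCommitError {}

/// Stand-in for a client method; the inputs that fail are arbitrary.
fn get_commit(id: &str) -> Result<String, GetCommitError> {
    match id {
        "" => Err(GetCommitError::NotFound(id.to_string())),
        "offline" => Err(GetCommitError::Transport("connection refused".to_string())),
        _ => Ok(format!("commit {}", id)),
    }
}

/// The caller matches on the variants directly; no downcast is needed,
/// and the compiler checks that every variant is handled.
fn describe(id: &str) -> String {
    match get_commit(id) {
        Ok(commit) => commit,
        Err(GetCommitError::NotFound(_)) => "no such commit".to_string(),
        Err(GetCommitError::Transport(msg)) => format!("retry later: {}", msg),
    }
}
```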
(Note: this ignores all push blocking failures!)
Reviewed By: dtolnay
Differential Revision: D20260398
fbshipit-source-id: f0dd96a7b83dd49f6b30948660456539012f82e6
Summary:
The old code does "read, lock, write", which is unsound because after "lock"
the data just read can be outdated and needs a reload.
Reviewed By: xavierd
Differential Revision: D20306137
fbshipit-source-id: a1c29d5078b2d47ee95cf00db8c1fcbe3447cccf
Summary:
I thought the index function could be the bottleneck. However, the Log reading
(xxhash, decoding vlqs) can be much slower for very long entries. Therefore
using bytes as the lag threshold is better. It does leak the Log
implementation details (how it encodes an entry) to some extent, though.
Reverts D20042045 and D20043116 logically. The lagging calculation is using
the new Index::get_original_meta API, which is easier to verify correctness
(In fact, it seems the old code is wrong - it might skip Index flushes if
sync() is called multiple times without flushing).
This should mitigate an issue where a huge entry (generated by `hg trace`) in
blackbox does not get indexed in time and cause performance regressions.
Reviewed By: DurhamG
Differential Revision: D20286508
fbshipit-source-id: 7cd694b58b95537490047fb1834c16b30d102f18
Summary: This will be used to more reliably detect index lags.
Reviewed By: DurhamG
Differential Revision: D20286518
fbshipit-source-id: c553b6587363a55603b75df12580588e3100e35f
Summary:
This ensures indexes are complete even if index format or definition has been
changed.
Reviewed By: DurhamG
Differential Revision: D20286509
fbshipit-source-id: fcc4ebc616a4501e4b6fd2f1a9826f54f40b99b8
Summary:
This avoids loading all blackbox logs when `init()` gets called multiple times
(for example, once in Rust and once in Python).
Reviewed By: DurhamG
Differential Revision: D20286511
fbshipit-source-id: ef985e454782b787feac90a6249651a882b6552e
Summary: This API has the benefit that it does not trigger loading older logs.
Reviewed By: DurhamG
Differential Revision: D20286512
fbshipit-source-id: 426421691ad1130cdbb2305612d76f18c9f8798c
Summary:
With the new crate-public interfaces and Debug implementations it's possible to
write tests for DagSet. So let's do it.
Reviewed By: sfilipco
Differential Revision: D20242561
fbshipit-source-id: 180e04d9535f79471c79c4307f6ab6e8e8815067
Summary:
Don't restrict constructing a c_api datapack store to only Unix, we can
construct it on Windows too by assuming that their path will be valid UTF-8.
Reviewed By: quark-zju
Differential Revision: D20250718
fbshipit-source-id: 07234b6a71b50c803cfe3b962fa727f57037c919
Summary: This returns the ancestors in the reverse order compared to the parents method.
Reviewed By: sfilipco
Differential Revision: D20265277
fbshipit-source-id: 83277cee3d8e9070fc56d20d4c1877e6782c22f7
Summary: Those will be reused by nameset::DagSet.
Reviewed By: sfilipco
Differential Revision: D20242563
fbshipit-source-id: 944e9a04aeb15439256ecea64355b67e326e5c89
Summary:
This is useful for `assert_eq!(format!("{:?}", set), "...")` tests.
It will be eventually exposed to Python as `__repr__`, similar to Python's
smartsets.
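A minimal sketch of the kind of test this enables. `StaticSet` and its output format are hypothetical stand-ins for illustration, not the crate's actual types or `Debug` output:

```rust
use std::fmt;

/// Hypothetical stand-in for a set type; not the real nameset API.
struct StaticSet(Vec<&'static str>);

impl fmt::Debug for StaticSet {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Render as `<static [a, b]>`. The format is made up for this sketch.
        write!(f, "<static [")?;
        for (i, name) in self.0.iter().enumerate() {
            if i > 0 {
                write!(f, ", ")?;
            }
            write!(f, "{}", name)?;
        }
        write!(f, "]>")
    }
}
```

With a compact `Debug` form like this, a test can compare the whole set against a single string instead of iterating over it.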
Reviewed By: sfilipco
Differential Revision: D20242562
fbshipit-source-id: 5373bb180db7cafebf273ace7cf2cb80fbfb8038
Summary:
In the Python world all smartsets have some kind of "debug" information. Let's
do something similar in Rust.
Related code is updated so the test is more readable.
Reviewed By: sfilipco
Differential Revision: D20242564
fbshipit-source-id: 7439c93d82d5d037c7167818f4e1125c5a1e513e
Summary:
Previously, `flush()` would skip writing the file if there were only metadata
changes. Fix it by detecting metadata changes.
This can potentially fix an issue where certain blackbox indexes are empty or
lagging and require scanning the whole log again and again. In that case,
the index itself is not changed (the root radix entry is not changed); only
the metadata tracking how many bytes of the Log the index covers has
changed.
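The fix can be pictured roughly as follows. This is a simplified model under stated assumptions: `ToyIndex` and all of its fields are hypothetical and do not mirror the real Index implementation.

```rust
/// Hypothetical, simplified model of an index; not the indexedlog type.
#[derive(Default)]
struct ToyIndex {
    /// True when in-memory entries differ from what is on disk.
    entries_dirty: bool,
    /// Current metadata (for example, how many Log bytes are covered).
    meta: Vec<u8>,
    /// Metadata as of the last flush that wrote something.
    flushed_meta: Vec<u8>,
    /// Number of simulated disk writes, for demonstration only.
    writes: usize,
}

impl ToyIndex {
    fn set_meta(&mut self, meta: &[u8]) {
        self.meta = meta.to_vec();
    }

    /// Returns true if something was written.
    fn flush(&mut self) -> bool {
        // The old behavior corresponds to checking `entries_dirty` alone.
        let meta_changed = self.meta != self.flushed_meta;
        if !self.entries_dirty && !meta_changed {
            return false;
        }
        self.writes += 1; // stands in for the actual file write
        self.entries_dirty = false;
        self.flushed_meta = self.meta.clone();
        true
    }
}
```

In this model, a metadata-only update triggers exactly one write, and a repeated `flush()` with no further changes is a no-op.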
Reviewed By: sfilipco
Differential Revision: D20264627
fbshipit-source-id: 7ee48454a92b5786b847d8b1d738cc38183f7a32
Summary:
Using `if cfg!` instead of `#[cfg]` allows for the compiler to understand
that the arguments aren't unused, and silence the warnings.
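A small illustration of the difference, using a made-up function rather than the code touched by this diff:

```rust
/// If the two branches were separate statements guarded by `#[cfg(unix)]`
/// and `#[cfg(not(unix))]`, any argument used by only one of them would be
/// reported as unused on the other platform. With `if cfg!`, both branches
/// are compiled and type-checked everywhere, so `path` always counts as used.
fn describe(path: &str) -> String {
    if cfg!(unix) {
        format!("unix:{}", path)
    } else {
        format!("other:{}", path)
    }
}
```

The trade-off is that both branches must compile on every platform, so `if cfg!` only works when neither branch refers to platform-specific items.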
Reviewed By: quark-zju
Differential Revision: D20242280
fbshipit-source-id: 332dfe17b3a80a1096d15c91c9fb6644bd10e0cd
Summary:
Compiling it on Windows produced a bunch of warnings due to
`hgrc_configset_load_path` not being compiled on it. Fixed it so it no longer
depends on Unix-specific imports.
Reviewed By: quark-zju
Differential Revision: D20241102
fbshipit-source-id: 3002f961191fbb9bc51aa9ac1154d6d50bd7fe23
Summary:
The `.into_iter()` for this object is being deprecated and won't compile in
the future. Fix it now.
Reviewed By: quark-zju
Differential Revision: D20241103
fbshipit-source-id: fdee463ed81cd07a65f3cc4c70a96c88928b3b87
Summary:
While compiling on Windows, this file issues a bunch of warnings. Use `if
cfg!` instead of `#[cfg]` to silence them. The behavior is the same, but the
latter allows the compiler to recognize that some items are not unused.
Reviewed By: quark-zju
Differential Revision: D20241104
fbshipit-source-id: 2cd7f171c7a2f7220cc73bea9be3359260de19b2
Summary:
The change is in theory not necessary. However, it improves the reliability on
OS crashes a bit, and can potentially work around some bugs in filesystems
(as we saw in production where the atomic-written files are empty and the
system didn't crash).
The idea is, the `symlink` syscall does the file creation and "content" writing
together, while there is no way to create a file and write specific content
in one syscall. Note that the C symlink call uses a 0-terminated string, and
the Rust stdlib exports it as accepting `Path`. To be safe, we encode binary
or non-utf8 content using `hex`.
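A rough, Unix-only sketch of the idea. The function names, the `hex:` prefix, and the temporary file name are assumptions made for illustration, and for simplicity this version always hex-encodes; it is not the actual indexedlog implementation.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::Path;

fn to_hex(data: &[u8]) -> String {
    data.iter().map(|b| format!("{:02x}", b)).collect()
}

fn from_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // `s` is pure ASCII here, so slicing by byte offsets is safe.
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Store `data` as the target of a symlink at `path`.
fn symlink_write(path: &Path, data: &[u8]) -> io::Result<()> {
    // Hypothetical temp name; real code would pick a unique one.
    let tmp = path.with_extension("tmp-symlink");
    let _ = fs::remove_file(&tmp);
    // One syscall creates the link and its "content" together.
    // The prefix (an assumption) also keeps the target non-empty.
    symlink(format!("hex:{}", to_hex(data)), &tmp)?;
    // Renaming over `path` replaces any previous link in one step.
    fs::rename(&tmp, path)
}

/// Read back the bytes stored by `symlink_write`.
fn symlink_read(path: &Path) -> io::Result<Vec<u8>> {
    let target = fs::read_link(path)?;
    target
        .to_str()
        .and_then(|s| s.strip_prefix("hex:"))
        .and_then(from_hex)
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "not a hex symlink"))
}
```

Symlink targets are limited in length by the operating system, so an approach like this only suits small payloads such as metadata.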
For downgrade safety, the write path does not use symlink by default unless
format.use-symlink-atomic-write is set to true. This makes downgrade possible:
the read path is rolled out first, then we can turn on and off the write path.
The indexedlog Rust unit tests and test-doctor.t are migrated to use the new
symlink code paths.
Reviewed By: DurhamG
Differential Revision: D20153864
fbshipit-source-id: c31bd4287a8d29575180fbcf7227d2b04c4c1252
Summary:
This makes it possible to implement atomic_write differently (ex. use a
symlink).
Reviewed By: DurhamG
Differential Revision: D20153865
fbshipit-source-id: 07fa78c2f2dac696668f477c75f65cf70950b73f
Summary:
This makes it clear that `log` is a math concept, not an append-only file like
`Log`.
Reviewed By: DurhamG
Differential Revision: D20149376
fbshipit-source-id: 67d2e9584b15f48759ca9b6dfce4279a5b1365a0
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/
This codemod replaces *all* dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`.
This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on *both* 0.1 and 0.3 futures.
Codemod performed by:
```
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
-x \
buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \
| xargs sed -i 's,\bfutures_preview::,futures::,'
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,'
```
Reviewed By: k21
Differential Revision: D20213432
fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6
Summary:
Also patch aho-corasick to fix the issue.
The issue was introduced by [an optimization path](063ca0d253) added in aho-corasick 0.7 series (used by globset 0.4.3).
aho-corasick 0.6.x (globset 0.4.2) are not affected.
The next aho-corasick release (0.7.9) contains the fix.
See https://github.com/BurntSushi/aho-corasick/issues/53 for more context.
Reported by: yns88
Reviewed By: DurhamG
Differential Revision: D20125697
fbshipit-source-id: 592375b43d7ee494bb3e916a1cb11c18f9ebe425
Summary:
Migrate away from some uses of revision numbers.
Some dead code in discovery.py is removed.
I also fixed some test issues when I ran tests locally.
Reviewed By: sfilipco
Differential Revision: D20155399
fbshipit-source-id: bfdcb57f06374f9f27be51b0980652ef50a2c8e0
Summary: This makes it possible to use NameIter in py_class.
Reviewed By: sfilipco
Differential Revision: D20020529
fbshipit-source-id: b9147b7dccb38d18d8361b420507fcbe97e01351
Summary: This will be used by commit hash prefix lookup.
Reviewed By: sfilipco
Differential Revision: D20020523
fbshipit-source-id: f2905ddf63098704b08dad8eb48272c3ffba7e25
Summary: Export common types at the top-level of the crate so it's easier to use.
Reviewed By: sfilipco
Differential Revision: D20020526
fbshipit-source-id: e9a0a8bc3cc91f81d0bc74e7530cd4613fc1dd61
Summary: Those just delegate to IdDag for the actual calculation.
Reviewed By: sfilipco
Differential Revision: D20020522
fbshipit-source-id: 272828c520097c993ab50dac6ecc94dc370c8e8b
Summary: This will be used to produce NameSet.
Reviewed By: sfilipco
Differential Revision: D20020519
fbshipit-source-id: abf6d73f2b985b74560d6b5db2800ff25450f02e
Summary: DagSet's SpanSet has fast paths for set operations. Use them.
Reviewed By: sfilipco
Differential Revision: D19912104
fbshipit-source-id: 24b55aa14d03be2f1be59c923e0b8e79d6bcbe6d
Summary: This is similar to hg's fullreposet. It'll be useful as a dummy "subset".
Reviewed By: sfilipco
Differential Revision: D19912108
fbshipit-source-id: 33a95bcb3cf5931a431a1201d1a1f3c627cec7a1
Summary: SortedSet is a wrapper around another set that marks it as topologically sorted.
Reviewed By: sfilipco
Differential Revision: D19912111
fbshipit-source-id: 2637e8fd29b97f6db0c5bae3f0decd7ac382eeb1
Summary:
Wraps SpanSet + IdMap so it only exposes commit names without ids.
There is no equivalent smartset in Mercurial.
Reviewed By: sfilipco
Differential Revision: D19912112
fbshipit-source-id: 0d257de11527dfa8836065ac94f652730a97a468
Summary: Similar to Mercurial's smartset.baseset. All names are statically known.
Reviewed By: sfilipco
Differential Revision: D19912105
fbshipit-source-id: e4fcf2d59291adb3ca01b3b90f1ac32c65ad7eaa
Summary:
This is an example about how to use the new Bytes type. The performance change
is not obviously visible in benchmarks since the bottleneck is not at the bytes
copying.
Reviewed By: DurhamG
Differential Revision: D19818720
fbshipit-source-id: a431ae206cfa4fa08b2e162a48b3d7cbcd900f7f
Summary: The APIs are compatible so the switch is straightforward.
Reviewed By: DurhamG
Differential Revision: D19818713
fbshipit-source-id: 504e9149567c90eb661804e0dad20580a401aa76
Summary:
D20042045 changes the meaning of "lag_threshold". Update the value in mutation
store accordingly.
Reviewed By: DurhamG
Differential Revision: D20043116
fbshipit-source-id: 154e6dc2aa88ab0a9a9b21929ae5fa6163dcd403
Summary:
Previously indexes are only updated at `sync()` time. This diff makes it so
`open()` can also update lagging indexes. This should make index migration
(ex. D19851355) smoother - indexes are built in time and users suffer less from
the absence of indexes.
Reviewed By: DurhamG
Differential Revision: D20042046
fbshipit-source-id: 20412661a0ca4f5f67b671137c47b6373a42981d
Summary: The logic is currently only used by `sync()`. I'd like to reuse it at `open()`.
Reviewed By: DurhamG
Differential Revision: D20042044
fbshipit-source-id: 5c9734ff68bdcf8f8c8710c6a821b18d3afeaca0
Summary:
This is more friendly for indexedlog users - deciding lag_threshold by number
of entries is easier than by bytes.
Initially, I thought checking `bytes` is cheaper and checking `entries` is more
expensive. However, practically we will have to build indexes for `entries`
anyway. So we do know the number of entries lagging behind.
Reviewed By: DurhamG
Differential Revision: D20042045
fbshipit-source-id: 73042e406bd8b262d5ef9875e45a3fd5f29f78cf
Summary:
This can be useful for users of indexedlog when they want `Bytes` (to get rid
of the lifetime parameter).
This might be useful for storage layer that wants to take the ownership of the
returned bytes.
Reviewed By: xavierd
Differential Revision: D19818714
fbshipit-source-id: cb2d4e7deff921915e07454fee15cb94a3d5c00d
Summary: Those utilities are no longer necessary since the new code uses Bytes.
Reviewed By: xavierd
Differential Revision: D19818717
fbshipit-source-id: 0b43af0f1eae1a4288e84d4170db058b27f80334
Summary: This simplifies the code a bit and makes it cheaper to clone the Log.
Reviewed By: xavierd
Differential Revision: D19818716
fbshipit-source-id: bbf07b8b36009d53b63d8066ec422fc3c3796840
Summary: It's no longer used since Index now has inlined its checksum logic.
Reviewed By: ikostia
Differential Revision: D19850744
fbshipit-source-id: eb134e4c1613573a2d238710b44ad8119c80a5ee
Summary:
Change index filename and metadata name. This makes sure the new format and old
format are separate so upgrading or downgrading won't have issues.
Reviewed By: DurhamG
Differential Revision: D19851355
fbshipit-source-id: 25dee018073a90040f5818b32b753a3f589c10e0
Summary:
Enhance the index format: The Root entry can be followed by an optional
Checksum entry which removes the need for ChecksumTable.
The format is backwards compatible since the old format will be just
treated as "there is no ChecksumTable", and the ChecksumTable will be built on
the next "flush".
This change is non-trivial. But the tests are pretty strong - the bitflip test
alone covered a lot of issues, and the dump of Index content helps a lot too.
For the index itself (leaving aside the ".sum" checksum file), this change is
compatible in both directions:
1. New code reading old file will just think the old file does not have the
checksum entry, similar to new code having checksum disabled.
2. Old code will think the root+checksum slice is the "root" entry. Parsing
the root entry is fine since it does not complain about unknown data at the
end.
However, this change dropped the logic updating ".sum" files. That part is an
issue blocking old clients from reading new data.
Reviewed By: DurhamG
Differential Revision: D19850741
fbshipit-source-id: 551a45cd5422f1fb4c5b08e3b207a2ffe3d93dea
Summary:
To solve the soundness issue of ChecksumTable raised by the last diff,
I plan to move the Checksum logic to Index. This has multiple benefits:
- Solve the soundness issue of ChecksumTable.
- Indexedlog no longer writes the ".sum" files. `atomic_write` can be quite
slow (tens of milliseconds) on Windows. So this should help perf - with
many indexes, it can save hundreds of milliseconds on Windows per
indexedlog sync.
This diff adds the definition and serialization of the new Checksum entry.
The index format is not updated yet.
Reviewed By: markbt
Differential Revision: D19850742
fbshipit-source-id: df6e6ed12a12ef0d2a782dc9d6b4dc5dec3f4b46
Summary:
With the last change, mmap cost is reduced, but ChecksumTable is unsound in a
corner case: the buffer to check is shorter than what ChecksumTable covers:
    checksum: |----chunk----|----chunk----|----chunk--|
    buf:      |-------------------------------|       |
                                              ^       ^
                                      logic len       physical len
The checksum table will be unable to verify the last chunk, since it does not
have enough data in buf.
The issue is exposed by stress testing the multithreaded sync tests. It's not
always easy to reproduce, though.
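The corner case described above can be reproduced with a toy model. Everything here is hypothetical (the chunk size, the checksum function, and all names); it only illustrates why a chunk that extends past the logical buffer cannot be verified.

```rust
/// Hypothetical chunk size; the real value is unrelated.
const CHUNK: usize = 4;

/// Toy checksum, not the one the real code uses.
fn toy_sum(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}

/// One checksum per chunk of the full (physical) content.
fn build_table(physical: &[u8]) -> Vec<u64> {
    physical.chunks(CHUNK).map(toy_sum).collect()
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Valid,
    Corrupt,
    CannotVerify,
}

/// Verify the chunk containing `offset`, given only the logical `buf`.
fn verify_chunk(table: &[u64], physical_len: usize, buf: &[u8], offset: usize) -> Verdict {
    let i = offset / CHUNK;
    let start = i * CHUNK;
    let end = ((i + 1) * CHUNK).min(physical_len);
    if i >= table.len() || end > buf.len() {
        // The checksum covers bytes that `buf` does not contain.
        return Verdict::CannotVerify;
    }
    if toy_sum(&buf[start..end]) == table[i] {
        Verdict::Valid
    } else {
        Verdict::Corrupt
    }
}
```

With 10 physical bytes and a 9-byte logical buffer, the first two chunks verify normally, while the last chunk yields `CannotVerify` even though every byte in the buffer is intact.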
Reviewed By: markbt
Differential Revision: D19850745
fbshipit-source-id: a1a96080163b7b9b56dcd6c1673d5d8d10e18a2b
Summary: This avoids some extra mmap syscalls by ChecksumTable.
Reviewed By: xavierd
Differential Revision: D19818721
fbshipit-source-id: dace55193f2b4b0f35e3868781faa2d2998d3b58
Summary:
This simplifies the code a bit (no special cases about 0-sized mmap buffers)
and makes it cheaper to clone the index buffer (just an Arc::clone, without
another mmap syscall).
Reviewed By: xavierd
Differential Revision: D19818718
fbshipit-source-id: e96d42af74c7f0bb11703c5da31cdfbd5d76c372
Summary:
TreeSpans used to use `&str`, which adds a lifetime to the struct, making it
harder to use from Python. Use a type parameter so TreeSpans<String>
can be used.
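A simplified sketch of the pattern. `Spans` and its single field are hypothetical and far smaller than the real TreeSpans; the point is only how a type parameter lets the same struct hold borrowed or owned strings.

```rust
/// Hypothetical, much-simplified span container; the field is made up.
struct Spans<S> {
    names: Vec<S>,
}

impl<S: AsRef<str>> Spans<S> {
    /// Works for both `Spans<&str>` and `Spans<String>`.
    fn longest_name(&self) -> Option<&str> {
        self.names
            .iter()
            .map(|s| s.as_ref())
            .max_by_key(|s| s.len())
    }

    /// Convert to an owned form that carries no lifetime parameter,
    /// which is easier to store in long-lived objects.
    fn to_owned_spans(&self) -> Spans<String> {
        Spans {
            names: self.names.iter().map(|s| s.as_ref().to_string()).collect(),
        }
    }
}
```

The borrowed form avoids copies while the source strings are alive; the owned form can outlive them, which is what a binding layer usually needs.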
Reviewed By: DurhamG
Differential Revision: D19797708
fbshipit-source-id: c66429abfaf16d876151ca6f29da976bed91485d
Summary:
The filtering interface allows call sites to select what they want. It's similar
to manifest walk with files or directory matchers in source control.
Reviewed By: DurhamG
Differential Revision: D19784467
fbshipit-source-id: 5cf6e4016d6fa1c90f8aeccc50809baccd4af5ab
Summary: The idea is that instants (events) can be a drop-in replacement for `ui.log`.
Reviewed By: DurhamG
Differential Revision: D19782897
fbshipit-source-id: 795bbba23d921e460f723f19ef529b203aea366a
Summary: This function will be reused by the next diff.
Reviewed By: DurhamG
Differential Revision: D19782895
fbshipit-source-id: 1e636eabee9b0dffd287a1e6784a24ab2259f51f
Summary: This allows us to define methods on the treespans, such as filtering APIs.
Reviewed By: DurhamG
Differential Revision: D19782896
fbshipit-source-id: 2e7bd8344c0196e382728c26a8233abf944bbf29
Summary: The Thrift generated code depends only on futures 0.3, not 0.1. Thus it isn't necessary to depend on renamed:futures-preview and we can depend on futures-preview directly, which is exposed to Rust code as `futures::`.
Reviewed By: jsgf
Differential Revision: D20145921
fbshipit-source-id: 5cae94ec6747a374c2bf05f124ab237c798de005
Summary:
This new method returns the content of a blob without the copy-from metadata
header.
Reviewed By: DurhamG
Differential Revision: D20102889
fbshipit-source-id: e96f636b7d30460b59707a2cb700d667e616116a