Summary:
This new method returns the content of a blob without the copy-from metadata
header.
Reviewed By: DurhamG
Differential Revision: D20102889
fbshipit-source-id: e96f636b7d30460b59707a2cb700d667e616116a
Summary:
The NameSet is something similar to SpanSet and Mercurial's smartset but speaks
VertexNames instead of Ids. The idea is, NameSet will be part of NameDag APIs,
and potentially replace Mercurial's smartset layer (just smartset the container
types, not the revset language), in a way that revision numbers are completely
hidden behind the scenes.
This diff adds some basic abstraction around iteration-related operations.
Other operations will be added later.
Reviewed By: sfilipco
Differential Revision: D19912109
fbshipit-source-id: 504a26c074282ec51f260535ca63e943124f688e
Summary:
Update the `print_status()` function to take a `clidispatch::io::IO` object as
a parameter, instead of a simple output object. This will allow us to also
print error messages from this function in a future diff.
Reviewed By: quark-zju
Differential Revision: D19958504
fbshipit-source-id: bf482fdc4420e1350363a730c6a539cd760aef25
Summary:
Fix the PathRelativizer APIs to accept `Path` and even `str` arguments instead
of just `PathBuf`. The old code required a `PathBuf`, which often forced
callers to make a copy of the path data.
Reviewed By: quark-zju
Differential Revision: D19958505
fbshipit-source-id: 6fa40dd4b75df4e3faf9ad2ae4f0e4e6595669f6
Summary:
The bytes 0.5 is a depencency of newer tokio, it's also newer, and thus better.
Staying on 0.4 means that copies between Bytes 0.4 and 0.5 need to be done,
this will be especially bad in the LFS code since 10+MB buffer will have to be
copied...
One main API change is for the configparser. The code used to take Into<Bytes>
for the keys, I switched it to AsRef<[u8]>.
For hg_memcache_client, an extra copy is performed to build a Delta, since this
code uses an old tokio, and is being replaced right now, the effort of
switching to a new tokio and new bytes was not deemed worth it, the copy will
do for now.
Reviewed By: dtolnay
Differential Revision: D20043137
fbshipit-source-id: 395bfc3749a3b1bdfea652262019ac6a086e61e0
Summary:
Mercurial filenode hash is computed by including the copy information in the
blob header. Before computing the blob content hash, or returning it to the
upper layers, we need to either strip or reconstruct this header appropriately.
Reviewed By: DurhamG
Differential Revision: D19975887
fbshipit-source-id: 7555e7219e50f4d18ec677fdecc216ee705d7af4
Summary: This will make it easier to support more hash schemes in the future.
Reviewed By: DurhamG
Differential Revision: D19975888
fbshipit-source-id: 8b8ce3b20d72199bac3cd20a48475b5ab56bfc52
Summary:
With the Arc embedded into the store themselves, this forces a second
allocation in order to use them as trait objects. Since in most cases, we do
not want the stores themselves to be cloneable, we can move the Arc outside and
thus reduce the number of pointer indirection.
Reviewed By: DurhamG
Differential Revision: D19867568
fbshipit-source-id: 9cd126831fe2b9ee715472ac3299b7a09df95fce
Summary:
The ContentStore now can read LFS blobs from both the shared cache, and the
local store.
Reviewed By: DurhamG
Differential Revision: D19866249
fbshipit-source-id: a6fb3523495e9d3832613b56438f631cfa552b91
Summary:
With the LFS store being added, and the indexedlog being soon used for trees,
this simplification should help in formalizing the hierarchy of files/folders.
It will look like the following:
<root dir>/lfs: for the lfs store
<root dir>/indexedlog*: for the indexedlog
<root dir>/foobar: for a hypothetical foobar store
For manifests, <root dir> will therefore be: <store dir>/manifests. The
unfortunate part is that the current tree data lives under
<store dir>/packs/manifests. As packfiles will be replaced, this small
discrepency is acceptable.
Reviewed By: DurhamG
Differential Revision: D19866248
fbshipit-source-id: 7ef59ef7df19149b19a529b4f4a45a479cc9d23b
Summary:
This is the first step in having a stronger integration between LFS blobs and
the ContentStore abstraction. The 2 main difference between the Python based
LFS implementation and this one are:
- pointers are not stored alongside plain data,
- blobs are split between local and shared blobs
As of now, no reclamation is being performed for shared blobs, blobs aren't
fetched or uploaded. This will come in future diffs.
Reviewed By: DurhamG
Differential Revision: D19859291
fbshipit-source-id: 45000fc574e6fbd6d3487f4966cad4f49dab731c
Summary:
Update the C files under edenscm/mercurial/cext to use absolute includes from
the repository root. Also update a few of the libraries in edenscm/mercurial
that the cext code depends on.
This makes these files easier to build with Buck in fbsource, and reduces the
number of places where we have to use deprecated Buck functionality to help
find these headers. This also allows autodeps to work with the build targets
for these rules.
Reviewed By: xavierd
Differential Revision: D19958221
fbshipit-source-id: e6e471583a795ba5773bae5f16ed582c9c5fd57e
Summary: Generate the Cargo.toml files inside xdiff with autocargo. This will enable Mononoke to depend on this code easily without sacrificing anything on eden/scm side.
Reviewed By: aslpavel
Differential Revision: D19948741
fbshipit-source-id: 905ff3d64b90830e5f075e4c6ed2b3de959e3f00
Summary: This will be used in the LFS store.
Reviewed By: DurhamG
Differential Revision: D19895803
fbshipit-source-id: 4cf447987c10fed0b5c98904f20c841428965d89
Summary:
In some cases, higher level stores may want to store data in either a plain
IndexedLog, or in a RotateLog, for local and shared data. Due to slight
difference between the 2, they can't easily be adapted into a common trait.
Instead let's just wrap both into an enum and implement the main functions that
the higher level stores need.
The first use of this will be the LfsStore, future use will include the
IndexedLogDataStore and the IndexedLogHistoryStores.
Reviewed By: DurhamG
Differential Revision: D19859292
fbshipit-source-id: 920572e0cf5f69bda4901a727a6b0dc0f08fc8d0
Summary:
When I run make local it's creating changes in our checked in thrift
types. I guess I need to check these in?
Reviewed By: quark-zju
Differential Revision: D19848706
fbshipit-source-id: 8a2e9a2617734eda41eade1f2645689362b1d75d
Summary:
Up to now, this has been done in chef, and thus for repos that we do not list,
they may share the memcache keys, with potential unintended consequences. Let's
always add the repo name to the key, so we can simplify the code in chef.
One small negative effect of this change is that while it is being rolled out,
the cache hit rate will be impacted. This should resolve itself quickly.
Reviewed By: DurhamG
Differential Revision: D19885775
fbshipit-source-id: 0b59ce9e378b0ab70f696a39d19d27cd89921098
Summary:
Failing means that we fallback to the Python importer. Let's simply warn about
it.
Reviewed By: fanzeyi
Differential Revision: D19897274
fbshipit-source-id: f9c63f5aa76015c28b31f00bba98244f5c86e923
Summary:
This makes it possible to use `Bytes` for mmap buffers.
The changes are because `minibytes::Bytes` does not implement `From<&[u8]>`
with the intention to make slice copy explicit.
Reviewed By: xavierd
Differential Revision: D19818719
fbshipit-source-id: c34ee451bfd2dc7bcbbcebd52a76444b6c236849
Summary:
EdenFS will now be able to fetch blobs directly from memcache. This won't have
any big benefits as no blobs are in memcache right now, but over time, this
will significantly reduce the cost of fetching blobs.
Reviewed By: fanzeyi
Differential Revision: D19861643
fbshipit-source-id: c2e9d317bd30d4656bf0b3f8897794161697761a
Summary:
These tracing points will help us understand the memcache hit rate as well as
the fetching speed.
Reviewed By: quark-zju
Differential Revision: D19836499
fbshipit-source-id: 1936c44efc3e7715069e6a959f5331139d591d5c
Summary:
Everytime a cache miss is seen, the data fetched from the server will be sent
directly to memcache for future use. Unfortunately, doing so in a blocking
manner severely impact the overall fetching speed from the server. Since
memcache is purely an optimization, we can afford to send data to it
asynchronously.
Let's move as much as possible of the code to a background thread to reduce the
overhead of memcache.
Reviewed By: DurhamG
Differential Revision: D19836011
fbshipit-source-id: 68e506ef7464d6e99d98457d0d37178f514be1a9
Summary:
Instead of fetching data one-by-one, let's prefetch data concurrently by using
the new get_iter function.
Reviewed By: DurhamG
Differential Revision: D19836009
fbshipit-source-id: 4a50328c0cbbba677c2de3777ebe4c34cb10c1e2
Summary:
Even when memcache would be able to prefetch everything, this would always call
into the underlying remote store with an empt key set. For things like `hg
prefetch` and a large number of keys, the effect of doing that is minimum, but
for EdenFS or `hg log -p`, the roundtrip to the server for every file/revision
would add a significant amount of overhead. Let's simply stop iterating when we
no longer need to fetch anything.
Reviewed By: DurhamG
Differential Revision: D19835797
fbshipit-source-id: 54ad704428c3b20d973cfa87f7171899ec44b3f9
Summary:
See also https://github.com/serde-rs/bytes/.
This will be used in the `dag` crate.
Reviewed By: DurhamG
Differential Revision: D19770858
fbshipit-source-id: 2a870a564e0ceecdc7a4667853b2b2a5ea4ce6e3
Summary:
This crate provides the core features of the commonly known `Bytes` crate:
zero-copy slicing and cloning, while also supports mmap-backed buffers.
The main motivation is to replace `Mmap` in `indexedlog`. That has multiple
benefits:
- Handles 0-sized mmap more cleanly.
- Handles clones more cleanly.
- Gain the flexibility to zero-copy data without lifetime / reference.
- Gain the flexibility to switch to non-mmap data.
The `bytes::Bytes` crate does not yet support mmap buffers as of its latest
release (0.5.4).
Implementation wise, `minibytes::Bytes` uses `Option<Arc<dyn Trait>>` for the
"trait object". This makes implementing the mmap storage just one line.
`bytes 0.5.4` re-invents the "trait object" manually using unsafe code. It requires
about 50 lines to implement the mmap storage (in D19756122).
Reviewed By: xavierd
Differential Revision: D19770856
fbshipit-source-id: 8cfa7052a18ac2e0cd6348b77d5e2a4acc61195c
Summary: This makes the output more readable even if the "name" of a span is very long.
Reviewed By: DurhamG
Differential Revision: D19780536
fbshipit-source-id: dce0d3777409c32b0752db51341a572addb823ea
Summary:
As initializing the memcache client takes ~0.7s, let's move it to a background
thread as to not impact Mercurial startup time. This diff uses ArcSwap in
order to reduce the overhead of the very common read paths as much as possible.
Using Mutex or RwLock instead would have caused unecessary contention.
Reviewed By: DurhamG
Differential Revision: D19518693
fbshipit-source-id: 886e9b86813fda6ff005ccce99659890026f643a
Summary:
This allows the Python code to build a memcache client and build ContentStore
and MetadataStore with it.
Reviewed By: DurhamG
Differential Revision: D19518694
fbshipit-source-id: d932fd5223ccfdf37db69cbb54a11a6571312709
Summary:
This enables an in-process memcache client for the Rust
ContentStore/MetadataStore. For now, this implementation is lacking several
necessary optimization:
- Start-up time is always slowed down by ~0.7s, the initialization will be
moved to a background thread
- Writing data to memcache is blocking and will be moved to a background
thread too.
- Prefetching data does a roundtrip to memcache for every key, batching
memcache APIs will be added.
Compared to the existing hg_memcache_client, this implementation is both
significantly shorter and do not exhibit some of the pathological behavior of
having to flush the indexedlog for every fetched blob when used in Eden.
Reviewed By: DurhamG
Differential Revision: D19518696
fbshipit-source-id: 4725447d13e7eddd9586135c2511e13ddb921771
Summary:
The library already has a way to wrap a Python object into a Rust object that
exposes the Rust Read/Write interface. This is the reverse direction for
the Write interface.
The initial intention is to expose Rust stdout as described in D19702533.
However, I found Python's `sys.stdout.buffer` also enforces utf-8 encoding
on Windows (unless PYTHONLEGACYWINDOWSSTDIO is set). So Python's
stdout actually behaves similarly with Rust's stdout on Windows and is okay
to use. That said, it's still useful to have this abstraction, for streampager [1]
integration.
[1]: https://github.com/markbt/streampager/
Reviewed By: sfilipco
Differential Revision: D19716127
fbshipit-source-id: ba39898122561d9a49b7080ee95d7c940540eb40
Summary:
Drop stdoutbytes/stdinbytes. They make things unnecessarily complicated
(especially for chg / Rust dispatch entry point).
The new idea is IO are using bytes. Text are written in utf-8 (Python 3) or
local encoding (Python 2). To make stdout behave reasonably on systems not
using utf-8 locale (ex. Windows), we might add a Rust binding to Rust's stdout,
which does the right thing:
- When writing to stdout console, expect text to be utf-8 encoded and do proper decoding.
- Wehn writing to stdout file, write the raw bytes without translation.
Note Python's `sys.stdout.buffer` does not do translation when writing to stdout console
like Rust's stdout.
For now, my main motivation of this change is to fix chg on Python 3.
Reviewed By: xavierd
Differential Revision: D19702533
fbshipit-source-id: 74704c83e1b200ff66fb3a2d23d97ff21c7239c8
Summary:
All diff functions are (bytes, bytes) -> bytes to preserver
the original file encoding.
Because of that I had to add ui.writebytes output function that accepts
bytes for terminal output.
Reviewed By: farnz
Differential Revision: D19656673
fbshipit-source-id: b9a1e4361e825fc8c2313e8402c2bbe00f490dd4
Summary: `PyPath` is to `PyPathBuf` as `Path` is to `PathBuf` and `str` is to `String`.
Reviewed By: quark-zju
Differential Revision: D19647995
fbshipit-source-id: 841a5f6fea295bc72b00da028ae256ca38578504
Summary: Username as utf8, so let's make mutationmarker treat them as such.
Reviewed By: xavierd
Differential Revision: D19649887
fbshipit-source-id: 3f8b2db434a57ee8ee3017de8d925c19a2002b20
Summary: Add `PyNone`. This is a marker struct that indicates that a python function can only return `PyNone`.
Reviewed By: xavierd
Differential Revision: D19644338
fbshipit-source-id: f846b146237ebf7de996177494934fec662cde0f
Summary:
`PyPath` is the type that owns the data. Rename it to `PyPathBuf` for analogy
with `PathBuf` and `RepoPathBuf`, and to allow us to introduce a reference type
named `PyPath`.
Reviewed By: xavierd
Differential Revision: D19643797
fbshipit-source-id: 56d80fea5677f7223e967b0723039d1763b26f68
Summary:
Make the type easier to use. Namely, the treestate bindings want PyPath <->
bytes since treestate internally uses bytes.
Reviewed By: xavierd
Differential Revision: D19635357
fbshipit-source-id: 37d1889b5da1d7f3869bb7820de0219b87b71a8b
Summary:
Set up the `cpython-ext` and `hgcommands` libraries so that they can compile
against py2 and py3 versions of rust-cpython. Make py2 the default so
that cargo test still works.
Reviewed By: singhsrb
Differential Revision: D19615656
fbshipit-source-id: 3403e7077deb3c0a9dfe0e3b7d4f4ad1da73bba3
Summary:
Update the Rust hgcommands code to pass the command line arguments into the
Python logic as `Str` types, so that this will be Unicode `str` objects when
using Python 3.
Reviewed By: xavierd
Differential Revision: D19596739
fbshipit-source-id: 7cdfd44a1c4ce8b0f86d20b634d9b27eab822b2d
Summary: This is incorrectly removed due to a bad rebase / merge.
Reviewed By: DurhamG
Differential Revision: D19607801
fbshipit-source-id: a6ee7a3f184ff1882eb1f1513f7fed74a7108727
Summary:
This makes `make hg3` work. It requires cleaning up the `build` directory when
switching between py2 and py3 build, which will be fixed later.
Reviewed By: DurhamG
Differential Revision: D19604824
fbshipit-source-id: 060ff313420126a5dba935c4451b45dc9af45f13
Summary: Add methods to convert to a `Str` object from `String` and from `Vec[u8]`
Reviewed By: xavierd
Differential Revision: D19596743
fbshipit-source-id: 6499f7f1b8329f4d14ce8179a41ed46982a85c8e
Summary: Update to the new version of rust-cpython. This supports `list.append`, so make use of it.
Reviewed By: xavierd
Differential Revision: D19590905
fbshipit-source-id: 03609d4f698ae8e4380e82b8144caaa205b4c2d4