Summary:
The OpenOptions allow multiple indices to be added, but lookup had no way
to query these multiple indices.
Reviewed By: quark-zju
Differential Revision: D20445627
fbshipit-source-id: 0cb754ba17b452d892b7bcb56d502d5753ef963a
Summary:
This type can be either a Mercurial type key or a content-hash-based key. Both
prefetch and get_missing can now handle these properly. This is essential
for stores where data can be fetched in both ways, or where the data is
split in two. For LFS, for instance, it is possible to have the LFS pointer (via
getpackv2) but not the actual blob. In that case get_missing will simply
return the content hash version of the StoreKey, to signify what is actually
missing.
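A minimal sketch of the idea, not the actual revisionstore code: `HgKey`,
`ContentHash`, `ToyLfsStore` and the variant names are hypothetical stand-ins.
It only illustrates how get_missing can answer with the content hash form of a
key when the pointer is present but the blob is not.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a Mercurial type key (path + hg node).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct HgKey {
    pub path: String,
    pub hgid: [u8; 20],
}

/// Hypothetical stand-in for a content hash.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ContentHash(pub [u8; 32]);

/// A key is either a Mercurial type key or a content hash based key.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum StoreKey {
    HgId(HgKey),
    Content(ContentHash),
}

/// Toy LFS-like store: pointers map hg keys to content hashes,
/// blobs are tracked by content hash only.
pub struct ToyLfsStore {
    pub pointers: HashMap<HgKey, ContentHash>,
    pub blobs: HashSet<ContentHash>,
}

impl ToyLfsStore {
    /// Return the keys that cannot be served. If the pointer is known but
    /// the blob is absent, report the content hash form of the key.
    pub fn get_missing(&self, keys: &[StoreKey]) -> Vec<StoreKey> {
        keys.iter()
            .filter_map(|k| match k {
                StoreKey::HgId(key) => match self.pointers.get(key) {
                    None => Some(k.clone()),
                    Some(hash) if self.blobs.contains(hash) => None,
                    Some(hash) => Some(StoreKey::Content(hash.clone())),
                },
                StoreKey::Content(hash) => {
                    if self.blobs.contains(hash) {
                        None
                    } else {
                        Some(k.clone())
                    }
                }
            })
            .collect()
    }
}
```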
Reviewed By: quark-zju
Differential Revision: D20445631
fbshipit-source-id: 06282f70214966cc96e805e9891f220b438c91a7
Summary:
Similarly to the DataStore trait, this makes it easier to understand that they
deal with a Mercurial type Key.
Reviewed By: quark-zju
Differential Revision: D20445621
fbshipit-source-id: a1143d5f5d6a2c8686d517a6ea3c25b07c0df072
Summary: This makes it clear that these traits are dealing with Mercurial Key.
Reviewed By: quark-zju
Differential Revision: D20445626
fbshipit-source-id: d5acbf442e9407b973e95e40af69b5a61bff0a4d
Summary:
Since configparser enforces utf-8 config files (because pest wants Rust strings),
let's migrate from Bytes to Text to remove extra encoding conversions.
Previously this was blocked by the lack of ref-counted text (since the "source"
of each config location is the entire config file). Now minibytes provides Text
so we can use it.
This unfortunately requires dependent code to be updated. The pyconfigparser
interface is in theory wrong - it shouldn't return utf-8 bytes but
local-encoded bytes. I think it's cleaner to make pyconfigparser unaware of
HGENCODING, so I changed pyconfigparser to use unicode, and added a compatibility
layer in uiconfig.py.
This also fixes non-ascii encoding issues with user names (especially on Windows).
The hgrc config file should be in utf-8, the config parser returns explicit
unicode types, and Python code round-trips them through local encodings.
Reviewed By: markbt
Differential Revision: D20432938
fbshipit-source-id: b1359429b8f1c133ab2d6b2deea6048377dfeca1
Summary:
This makes it easier to migrate further to the `Text` interface.
The dependent crate (`auth`) is updated.
Reviewed By: markbt
Differential Revision: D20432941
fbshipit-source-id: 1dc29d52c9b17ce14676ef0555470c6d36a09c2b
Summary:
Text is a reference-counted shared String.
It's similar to Bytes but works for utf-8 strings.
The motivation is to replace configparser's use of Bytes with Text.
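As a rough illustration of the concept only (this is not the minibytes API;
`SharedText` and its methods are hypothetical), a reference-counted string
with zero-copy slicing can be sketched on top of `Arc<str>`:

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical ref-counted utf-8 text. Cloning and slicing share one buffer.
#[derive(Clone, Debug)]
pub struct SharedText {
    buf: Arc<str>,
    range: Range<usize>,
}

impl SharedText {
    /// Copy `s` once into a shared buffer.
    pub fn new(s: &str) -> Self {
        SharedText { buf: Arc::from(s), range: 0..s.len() }
    }

    pub fn as_str(&self) -> &str {
        &self.buf[self.range.clone()]
    }

    /// Sub-slice relative to this text, without copying.
    /// Panics if the range is out of bounds or splits a utf-8 character.
    pub fn slice(&self, range: Range<usize>) -> SharedText {
        let start = self.range.start + range.start;
        let end = self.range.start + range.end;
        assert!(start <= end && end <= self.range.end);
        assert!(self.buf.is_char_boundary(start) && self.buf.is_char_boundary(end));
        SharedText { buf: Arc::clone(&self.buf), range: start..end }
    }

    /// True if both values point at the same underlying allocation.
    pub fn shares_buffer_with(&self, other: &SharedText) -> bool {
        Arc::ptr_eq(&self.buf, &other.buf)
    }
}
```

With something of this shape, every config value can keep a cheap handle to
the whole config file as its "source" instead of copying it.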
Reviewed By: markbt
Differential Revision: D20432940
fbshipit-source-id: ef990255d269e60d433c6520819f60ccdcbe488f
Summary: This makes it possible to implement "Text". See the next diff.
Reviewed By: markbt
Differential Revision: D20432943
fbshipit-source-id: 94b3810ab205c260d33f57bd637e4accc3ee871d
Summary:
This makes the API easier to use.
Practically this makes it easier for configparser to migrate to minibytes.
Reviewed By: markbt
Differential Revision: D20432942
fbshipit-source-id: ad08eb118d2216054dc24c86b0b129ae82b9d17c
Summary:
Previously Rust str was serialized into bytes. To be Python 3 friendly, let's
serialize it into `str`.
Reviewed By: markbt
Differential Revision: D19797706
fbshipit-source-id: 388eb044dc7e25cdc438f0c3d6fa5a5740f22e3d
Summary:
The goal of the stack is to support "rendering" diffs for large files in scs
server. Note that rendering is in quotes - we are fine with just showing a
placeholder like "Binary file ... differs". This is still better than the
current behaviour, which just returns an error.
In order to do that, I suggest tweaking the xdiff library to accept FileContentType,
which can be either Normal(...), meaning that we have file content available, or
Omitted, which usually means the file is large and we don't even want to fetch it, and we
just want xdiff to generate a placeholder.
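A sketch of what that could look like. Only `FileContentType`, `Normal` and
`Omitted` come from the description above; `DiffPlan`, `plan_diff` and the
exact placeholder text are assumptions made for illustration.

```rust
/// Either the file content is available, or it was deliberately not fetched.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FileContentType<'a> {
    Normal(&'a [u8]),
    Omitted,
}

/// Hypothetical result type: what the diff code should do next.
#[derive(Debug, PartialEq, Eq)]
pub enum DiffPlan {
    Identical,
    LineDiff,
    Placeholder(String),
}

/// Decide how to render a diff for `path`.
/// If either side is omitted we cannot compare content, so a placeholder
/// is emitted instead of returning an error.
pub fn plan_diff(path: &str, old: FileContentType<'_>, new: FileContentType<'_>) -> DiffPlan {
    match (old, new) {
        (FileContentType::Normal(a), FileContentType::Normal(b)) => {
            if a == b {
                DiffPlan::Identical
            } else {
                DiffPlan::LineDiff
            }
        }
        _ => DiffPlan::Placeholder(format!("Binary file {} differs", path)),
    }
}
```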
Reviewed By: markbt, krallin
Differential Revision: D20389226
fbshipit-source-id: 0b776d4f143e2ac657d664aa9911f6de8ccfea37
Summary:
This will be used in the Python world for legacy reasons. It shouldn't be used
in new Rust code.
To use it, the name `LegacyCodeNeedIdAccess` has to be used so we can do a code
search to find all users of it.
Reviewed By: sfilipco
Differential Revision: D20367834
fbshipit-source-id: 9b93a29f1461ce24bba6f31a2bbb1f327e216c6d
Summary: This will be useful to actually sort commits.
Reviewed By: sfilipco
Differential Revision: D20367835
fbshipit-source-id: 43bc7835277af3a14ef323ce34247e0c03878dc8
Summary:
The old "AllSet" implementation is not very practical - it does not support
iteration. Practically, the "all()" set comes from the DAG. Change the "all"
concept to a hint similar to "is_topo_sorted", and update the fast path
(intersection) accordingly.
Reviewed By: sfilipco
Differential Revision: D20367837
fbshipit-source-id: fdbf370897c93058bfcab0571c1f6fa4b99b0f6b
Summary: The word "snapshot" more accurately describes its purpose.
Reviewed By: sfilipco
Differential Revision: D20367836
fbshipit-source-id: c91a0bd402fa1718b5d805beedc0e062824c53d3
Summary:
Without this:
In [3]: util.getfstype('')
IOError: [Errno 2] No such file or directory (os error 2)
And there is a code path hitting this:
File "edenscm/mercurial/util.py", line 1483, in checknlink
fstype = getfstype(os.path.dirname(testfile))
# testfile = '.'
# os.path.dirname(".") = ""
The old implementation works fine for an empty path:
In [2]: m.util.getfstype('')
Out[2]: 'eden'
So let's make the new Rust implementation consistent.
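One possible way to get that behaviour, assuming the fix is to treat an empty
path as the current directory (the message does not say how the Rust side was
changed; `normalize_for_fstype` is a hypothetical helper):

```rust
use std::path::{Path, PathBuf};

/// Map an empty path to "." so that later filesystem calls do not fail
/// with "No such file or directory". Other paths are returned unchanged.
pub fn normalize_for_fstype(path: &Path) -> PathBuf {
    if path.as_os_str().is_empty() {
        PathBuf::from(".")
    } else {
        path.to_path_buf()
    }
}
```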
Reviewed By: xavierd
Differential Revision: D20313387
fbshipit-source-id: 258c424a3e8a796d983e20b0d4656e8e3f413706
Summary: Similar to D13982877. Try to get names like "fuse.ntfs".
Reviewed By: farnz
Differential Revision: D20313392
fbshipit-source-id: 8363d3d92843e6afb53a0003950be083034bd841
Summary:
Only keep type parameters at the top-level function.
This reduces the binary size and speeds up rustc.
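The general pattern, sketched with a hypothetical `count_components` function
rather than the real fstype code: the generic wrapper is tiny and only
converts its argument, while the body is compiled once as a non-generic
function.

```rust
use std::path::Path;

/// Generic entry point: monomorphized per caller type, but trivially small.
pub fn count_components<P: AsRef<Path>>(path: P) -> usize {
    // Non-generic body: compiled once no matter how many `P` types are used.
    fn inner(path: &Path) -> usize {
        path.components().count()
    }
    inner(path.as_ref())
}
```

Callers can still pass `&str`, `String`, `PathBuf` and so on, but only the
one-line wrapper is duplicated for each of those types.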
Reviewed By: xavierd
Differential Revision: D20313388
fbshipit-source-id: 29d77731ff462fee1f1bb9f234601e3430198ae7
Summary: This makes the code a bit more portable.
Reviewed By: xavierd
Differential Revision: D20313389
fbshipit-source-id: 080538939fa4d2d72e5905f23ad9be987d952748
Summary:
Rename the main method to "fstype". The API has no relation with repo.
So let's rename it.
Reviewed By: xavierd
Differential Revision: D20313386
fbshipit-source-id: 80dd1231ccccfe945150b117b151bce773f0dfeb
Summary:
Since the mocked memcache is shared between the tests, we need to make sure the
keys used by the tests are different, otherwise they are just caching each
other's data.
Reviewed By: ikostia
Differential Revision: D20388783
fbshipit-source-id: 0f2f926e0ffe0e52e55291e46142808ce0921288
Summary:
Some `use`s are not used on Windows. The code was also formatted using the
latest rustfmt.
Reviewed By: xavierd
Differential Revision: D20379704
fbshipit-source-id: ffadcd68e4e0440dcbd2a4e1ad8532b47a9d83e2
Summary: Similarly to the ContentStore, remove the Arc from MetadataStore.
Reviewed By: quark-zju
Differential Revision: D20376838
fbshipit-source-id: 4321600b752c919b6d9fa7bdee6f6cb7ae083b10
Summary:
The clients should use an Rc/Arc if they need the ability to clone it. This
makes it more obvious and reduces the number of pointer indirections.
Reviewed By: quark-zju
Differential Revision: D20376839
fbshipit-source-id: c56e7e8f89ab17727be621894c329e344a7f3adb
Summary:
The dag crate is designed to work with any kind of binary commit hashes (ex. bonsai,
git or hg). The only use of `types` is to convert from binary to hex. Since dag
already has its own `to_hex` logic in `VertexName`, let's use that instead.
Reviewed By: sfilipco
Differential Revision: D20378447
fbshipit-source-id: 00ecb551ea927fdb60dd91e5e645064f23139bcd
Summary:
Recently there has been some Windows-related test flakiness in . All of the failures are
caused by `file.persist(path)` in `atomic_write_plain` failing with
"Access Denied". Since that can be caused by Windows Anti-Virus scans or other
weird stuff, let's work around it using automatic retries.
Process Explorer does not provide extra information:
indexedlog-d0c6135fd7ed9ece.exe 5868 SetRenameInformationFile C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpcfDsQQ ACCESS DENIED ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta
A successful rename looks like:
indexedlog-d0c6135fd7ed9ece.exe 5868 SetRenameInformationFile C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpbXEVw0 SUCCESS ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta
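The retry idea can be sketched as below. This is not the indexedlog
implementation: `retry_on_permission_denied`, the attempt count and the delay
are all assumptions. It relies on Windows "Access Denied" surfacing as
`io::ErrorKind::PermissionDenied`.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Run `op`, retrying only on PermissionDenied, up to `max_attempts` times.
/// Any other error, or the last PermissionDenied, is returned to the caller.
pub fn retry_on_permission_denied<T>(
    max_attempts: usize,
    delay: Duration,
    mut op: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut attempt = 1;
    loop {
        match op() {
            Err(e) if e.kind() == io::ErrorKind::PermissionDenied && attempt < max_attempts => {
                attempt += 1;
                // Give the scanner (or whatever holds the file) time to let go.
                thread::sleep(delay);
            }
            other => return other,
        }
    }
}
```

A caller would wrap the rename step in the closure, for example
`retry_on_permission_denied(5, Duration::from_millis(10), || std::fs::rename(&tmp, &dst))`,
where `tmp` and `dst` are the caller's paths.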
Reviewed By: ikostia
Differential Revision: D20379618
fbshipit-source-id: db3e6be3d785875486f7a517df11cbf58bf65ddd
Summary:
Now that the ContentStore can automatically strip the metadata header, no need
for duplicated code in the backingstore.
Reviewed By: fanzeyi
Differential Revision: D20376812
fbshipit-source-id: e863e1cc2fcdc8b9e612a464b305fa25ceb66e13
Summary:
During `hg update`, Mercurial forks multiple processes to write files to disk
concurrently. This is done because fetching blobs from the content store and
writing them to disk is CPU bound. Usually, threads would be the preferred way
of speeding up such a process, but unfortunately, Python has a GIL that severely
limits the available concurrency. So, multiple processes were chosen.
Unfortunately, the multi-process solution also brings a lot of other issues.
More recently, we've had cases where the connections to the server and memcache
had to be dropped after the fork. In some other cases, this caused deadlocks.
And the solution is not effective on Windows.
Now that Mercurial is getting more and more Rust, we could instead go back to
the threads solution by using them in Rust, and have Python just push work to
them. This is exactly what this change does.
Things that are left to be done, but I wanted to get a diff out first:
- no file path audit
- no file backup
- no symlink creation
- probably other things I'm missing
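For illustration only, a std-only sketch of "push work to Rust threads". The
function name `write_files`, the job shape and the channel-based pool are
assumptions, not the code in this diff, and like the diff it does no path
audit, backup or symlink handling.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Write every (path, data) job using `workers` threads.
/// Returns the number of files written, or the first error seen on join.
pub fn write_files(jobs: Vec<(PathBuf, Vec<u8>)>, workers: usize) -> io::Result<usize> {
    let (tx, rx) = mpsc::channel::<(PathBuf, Vec<u8>)>();
    // The receiver is shared by all workers behind a mutex.
    let rx = Arc::new(Mutex::new(rx));

    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || -> io::Result<usize> {
                let mut written = 0;
                loop {
                    // The lock is held only while receiving, not while writing.
                    let job = rx.lock().unwrap().recv();
                    match job {
                        Ok((path, data)) => {
                            fs::write(&path, &data)?;
                            written += 1;
                        }
                        // The sender was dropped and the queue is drained.
                        Err(_) => return Ok(written),
                    }
                }
            })
        })
        .collect();

    // The caller (Python, in the real setting) only pushes work.
    for job in jobs {
        tx.send(job).expect("receiver is kept alive by this function");
    }
    drop(tx);

    let mut total = 0;
    for handle in handles {
        total += handle.join().expect("worker panicked")?;
    }
    Ok(total)
}
```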
Reviewed By: quark-zju
Differential Revision: D20102888
fbshipit-source-id: d47829fd7818b97710586b9851880f178048e27b
Summary:
With this new store, blobs will be transparently written to either an LFS
store, or a non-LFS one, depending on their size.
Initially, and as long as getpackv2 is supported, we also need to support
parsing the LFS pointer data that the server is sending and writing it to the LFS
pointer store. This code is very ad hoc and does manual parsing of the pointer
data, which is definitely not great; suggestions for a simpler and better solution are
welcome :).
From a migration standpoint, the read-only LFS stores are added to the
ContentStore. This allows blobs written to them to be readable at all times, even
when `remotefilelog.lfs` isn't set. The code will effectively be dormant for a
while until the option is turned on. If we need to disable it, the dormant code
will still be able to read all the blobs written to disk. This forces us to
deploy a release that contains this code to stable first, before setting
`remotefilelog.lfs`.
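A toy model of the size-based routing, with in-memory maps standing in for the
two stores. `SizeRoutedStore`, the field names and the `>=` comparison against
the threshold are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical store that routes writes by blob size and reads from both.
#[derive(Default)]
pub struct SizeRoutedStore {
    pub lfs_threshold: usize,
    pub lfs: HashMap<String, Vec<u8>>,
    pub non_lfs: HashMap<String, Vec<u8>>,
}

impl SizeRoutedStore {
    pub fn new(lfs_threshold: usize) -> Self {
        SizeRoutedStore { lfs_threshold, ..Default::default() }
    }

    /// Writers do not pick a store; the blob size decides.
    pub fn add(&mut self, key: &str, blob: Vec<u8>) {
        if blob.len() >= self.lfs_threshold {
            self.lfs.insert(key.to_string(), blob);
        } else {
            self.non_lfs.insert(key.to_string(), blob);
        }
    }

    /// Reads always consult both stores, so blobs written to the LFS side
    /// stay readable regardless of how writes are currently routed.
    pub fn get(&self, key: &str) -> Option<&[u8]> {
        self.non_lfs
            .get(key)
            .or_else(|| self.lfs.get(key))
            .map(|v| v.as_slice())
    }
}
```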
Reviewed By: quark-zju
Differential Revision: D19986878
fbshipit-source-id: 260f5a542d52e748c0c703bfa7bb8ffac0e7b388
Summary: This makes `RUST_LOG` work for indexedlog tests.
Reviewed By: xavierd
Differential Revision: D20286515
fbshipit-source-id: ff4a1476eb01a9067dabe3622fd598f65fe86a18
Summary:
The tracing / env_logger integration works for hg as a binary. However I'd also
like to use it in library tests. This crate makes it easier to do so.
Reviewed By: xavierd
Differential Revision: D20286507
fbshipit-source-id: f5bf3288ce950591ddfe64b524ad51ce21ee4099
Summary: These have helped me debug some issues.
Reviewed By: xavierd
Differential Revision: D20286513
fbshipit-source-id: 012ddb16c2d0efd8f8697a5ecd4564ea31d65630
Summary: Move the scope of spans so the exit code is shown.
Reviewed By: xavierd
Differential Revision: D20286516
fbshipit-source-id: f39cbf60c86ea19a1bb0a09958748f04ff6a42e8
Summary:
Previously env_logger was only initialized if Python was initialized.
This diff makes env_logger initialize for Rust native commands too.
Reviewed By: xavierd
Differential Revision: D20286517
fbshipit-source-id: 18fee96c2b41db1da9648d615d1e18809de90a63
Summary:
This means crates like env_logger (which reads $RUST_LOG, and writes to stderr)
can be used for convenient debugging.
Reviewed By: xavierd
Differential Revision: D20286514
fbshipit-source-id: e3b80cc4830ba5cc6dbf7aa1cbb92a4f4f046a54
Summary:
This metadata includes module_path, target, line number, etc., in Rust native
format. It will be used for the upcoming `log` integration.
Reviewed By: xavierd
Differential Revision: D20286510
fbshipit-source-id: 27019b941bef08c0bb3e505bbdae642282dcb141
Summary:
Split lock file acquisition from `IdDag::prepare_filesystem_sync` into its own
function.
This is useful when looking ahead to splitting IdDag from IndexedLog.
Reviewed By: quark-zju
Differential Revision: D20316443
fbshipit-source-id: a0fd43439730376920706bb4349ce497f6624335
Summary:
This removes an inline use of the indexedlog indexes.
This is going to be useful when we try to separate IndexedLog specifics from
IdDag functionality.
Reviewed By: quark-zju
Differential Revision: D20316058
fbshipit-source-id: 942a0a71660bb327376c81fd3ac435d002ecca6e
Summary:
Instead of returning `anyhow::Error` wrapping an `ErrorKind` enum
from each Thrift client method, just return an error type specific
to that method. This will make error handling simpler and less
error-prone by removing the need to downcast the returned error.
This diff also removes the `ErrorKind` enums so that we can be sure
that there are no leftover places trying to downcast to them.
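To illustrate the difference for callers, here is a sketch with a hypothetical
`ToyClient` and `LookupCommitError` (not generated Thrift code). The error enum
is specific to one method, so it can be matched directly with no downcast.

```rust
use std::fmt;

/// Hypothetical error type belonging to a single client method.
#[derive(Debug, PartialEq, Eq)]
pub enum LookupCommitError {
    NotFound(String),
    Transport(String),
}

impl fmt::Display for LookupCommitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LookupCommitError::NotFound(id) => write!(f, "commit {} not found", id),
            LookupCommitError::Transport(msg) => write!(f, "transport error: {}", msg),
        }
    }
}

impl std::error::Error for LookupCommitError {}

/// Hypothetical client used only to exercise the error type.
pub struct ToyClient {
    pub connected: bool,
    pub commits: Vec<String>,
}

impl ToyClient {
    pub fn lookup_commit(&self, id: &str) -> Result<usize, LookupCommitError> {
        if !self.connected {
            return Err(LookupCommitError::Transport("not connected".to_string()));
        }
        self.commits
            .iter()
            .position(|c| c.as_str() == id)
            .ok_or_else(|| LookupCommitError::NotFound(id.to_string()))
    }
}

/// Caller side: exhaustive match on the method's own error type.
pub fn describe(client: &ToyClient, id: &str) -> String {
    match client.lookup_commit(id) {
        Ok(index) => format!("found at {}", index),
        Err(LookupCommitError::NotFound(id)) => format!("{} does not exist", id),
        Err(LookupCommitError::Transport(msg)) => format!("retry later: {}", msg),
    }
}
```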
(Note: this ignores all push blocking failures!)
Reviewed By: dtolnay
Differential Revision: D20260398
fbshipit-source-id: f0dd96a7b83dd49f6b30948660456539012f82e6
Summary:
The old code does "read, lock, write", which is unsound because after "lock"
the data just read can be outdated and needs a reload.
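An in-process analogy of the sound ordering, "lock, read, write". Here a
`Mutex<()>` stands in for the on-disk lock and a `Mutex<Vec<u32>>` stands in
for the file contents; `ToyStore` and its methods are hypothetical.

```rust
use std::sync::Mutex;

pub struct ToyStore {
    /// Stand-in for the lock file.
    lock: Mutex<()>,
    /// Stand-in for the data on disk.
    disk: Mutex<Vec<u32>>,
}

impl ToyStore {
    pub fn new() -> Self {
        ToyStore { lock: Mutex::new(()), disk: Mutex::new(Vec::new()) }
    }

    /// Read the current on-disk state.
    pub fn load(&self) -> Vec<u32> {
        self.disk.lock().unwrap().clone()
    }

    fn save(&self, data: Vec<u32>) {
        *self.disk.lock().unwrap() = data;
    }

    /// Sound ordering: take the lock first, then read, then write.
    /// Reading before locking could overwrite another writer's update,
    /// because the data read would be stale by the time the lock is held.
    pub fn append(&self, value: u32) {
        let _guard = self.lock.lock().unwrap();
        let mut data = self.load();
        data.push(value);
        self.save(data);
    }
}
```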
Reviewed By: xavierd
Differential Revision: D20306137
fbshipit-source-id: a1c29d5078b2d47ee95cf00db8c1fcbe3447cccf
Summary:
I thought the index function could be the bottleneck. However, the Log reading
(xxhash, decoding vlqs) can be much slower for very long entries. Therefore
using bytes as the lag threshold is better. It does leak the Log
implementation details (how it encodes an entry) to some extent, though.
Reverts D20042045 and D20043116 logically. The lagging calculation is using
the new Index::get_original_meta API, whose correctness is easier to verify.
(In fact, it seems the old code is wrong - it might skip Index flushes if
sync() is called multiple times without flushing).
This should mitigate an issue where a huge entry (generated by `hg trace`) in
blackbox does not get indexed in time and causes performance regressions.
Reviewed By: DurhamG
Differential Revision: D20286508
fbshipit-source-id: 7cd694b58b95537490047fb1834c16b30d102f18
Summary: This will be used to more reliably detect index lags.
Reviewed By: DurhamG
Differential Revision: D20286518
fbshipit-source-id: c553b6587363a55603b75df12580588e3100e35f
Summary:
This ensures indexes are complete even if index format or definition has been
changed.
Reviewed By: DurhamG
Differential Revision: D20286509
fbshipit-source-id: fcc4ebc616a4501e4b6fd2f1a9826f54f40b99b8
Summary:
This avoids loading all blackbox logs when `init()` gets called multiple times
(for example, once in Rust and once in Python).
Reviewed By: DurhamG
Differential Revision: D20286511
fbshipit-source-id: ef985e454782b787feac90a6249651a882b6552e
Summary: This API has the benefit that it does not trigger loading older logs.
Reviewed By: DurhamG
Differential Revision: D20286512
fbshipit-source-id: 426421691ad1130cdbb2305612d76f18c9f8798c
Summary:
With the new crate-public interfaces and Debug implementations it's possible to
write tests for DagSet. So let's do it.
Reviewed By: sfilipco
Differential Revision: D20242561
fbshipit-source-id: 180e04d9535f79471c79c4307f6ab6e8e8815067
Summary:
Don't restrict constructing a c_api datapack store to Unix only. We can
construct it on Windows too by assuming that the path is valid UTF-8.
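A sketch of that assumption, using a hypothetical `bytes_to_path` helper rather
than the actual c_api code: on Unix the raw bytes are used as-is, elsewhere
they must decode as UTF-8.

```rust
use std::io;
use std::path::PathBuf;

/// Unix: paths are arbitrary bytes, so no decoding is needed.
#[cfg(unix)]
pub fn bytes_to_path(bytes: &[u8]) -> io::Result<PathBuf> {
    use std::ffi::OsStr;
    use std::os::unix::ffi::OsStrExt;
    Ok(PathBuf::from(OsStr::from_bytes(bytes)))
}

/// Other platforms (e.g. Windows): assume the bytes are valid UTF-8 and
/// report an error instead of panicking when they are not.
#[cfg(not(unix))]
pub fn bytes_to_path(bytes: &[u8]) -> io::Result<PathBuf> {
    std::str::from_utf8(bytes)
        .map(PathBuf::from)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```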
Reviewed By: quark-zju
Differential Revision: D20250718
fbshipit-source-id: 07234b6a71b50c803cfe3b962fa727f57037c919