Commit Graph

325 Commits

Author SHA1 Message Date
Xavier Deguillard
f466037b4b revisionstore: fix memcache test flakiness
Summary:
Since the mocked memcache is shared between the tests, we need to make sure the
keys used by the tests are different, otherwise they are just caching each
others data.

Reviewed By: ikostia

Differential Revision: D20388783

fbshipit-source-id: 0f2f926e0ffe0e52e55291e46142808ce0921288
2020-03-11 15:58:03 -07:00
Jun Wu
97e9b81ba5 indexedlog: remove compiler warnings on Windows
Summary:
Some `use`s are not used on Windows. The code was also formatted using the
latest rustfmt.

Reviewed By: xavierd

Differential Revision: D20379704

fbshipit-source-id: ffadcd68e4e0440dcbd2a4e1ad8532b47a9d83e2
2020-03-11 15:54:19 -07:00
Xavier Deguillard
c98b9cfff9 revisionstore: remove Arc from MetadataStore
Summary: Similarly to the ContentStore, remove the Arc from MetadataStore.

Reviewed By: quark-zju

Differential Revision: D20376838

fbshipit-source-id: 4321600b752c919b6d9fa7bdee6f6cb7ae083b10
2020-03-11 13:39:06 -07:00
Xavier Deguillard
7e704ec7fb revisionstore: remove the Arc from ContentStore
Summary:
The clients should use an Rc/Arc if they need the ability to clone it. This
makes it more obvious and reduces the number of pointer indirection.

Reviewed By: quark-zju

Differential Revision: D20376839

fbshipit-source-id: c56e7e8f89ab17727be621894c329e344a7f3adb
2020-03-11 13:39:05 -07:00
Jun Wu
4960709aa3 dag: do not depend on types
Summary:
The dag crate is designed to work with any kind of binary commit hashes (ex. bonsai,
git or hg). The only use of `types` is to convert from binary to hex. Since dag
already has its own `to_hex` logic in `VertexName`. Let's use that instead.

Reviewed By: sfilipco

Differential Revision: D20378447

fbshipit-source-id: 00ecb551ea927fdb60dd91e5e645064f23139bcd
2020-03-11 10:49:31 -07:00
Jun Wu
009ea22175 indexedlog: retry rename in atomic_write on Windows
Summary:
Recently there are some Windows-related test flakiness in . All of them are
caused by `file.persist(path)` in `atomic_write_plain` failing with
"Access Denied". Since that can be caused by Windows Anti-Virus scans or other
weird stuff, let's workaround around it using automatically retires.

Process Explorer does not provide extra information:

    indexedlog-d0c6135fd7ed9ece.exe	5868	SetRenameInformationFile	C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpcfDsQQ	ACCESS DENIED	ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta

A successful rename looks like:

    indexedlog-d0c6135fd7ed9ece.exe	5868	SetRenameInformationFile	C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpbXEVw0	SUCCESS	ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta

Reviewed By: ikostia

Differential Revision: D20379618

fbshipit-source-id: db3e6be3d785875486f7a517df11cbf58bf65ddd
2020-03-11 10:06:47 -07:00
Xavier Deguillard
5d230aef68 backingstore: use get_file_content to strip metadata
Summary:
Now that the ContentStore can automatically strip the metadata header, no need
for duplicated code in the backingstore.

Reviewed By: fanzeyi

Differential Revision: D20376812

fbshipit-source-id: e863e1cc2fcdc8b9e612a464b305fa25ceb66e13
2020-03-11 09:40:26 -07:00
Xavier Deguillard
40bbe7b4da merge: add a Rust threaded file updater
Summary:
During `hg update`, Mercurial forks multiple processes to write files on disk
concurrently, this is done as fetching blobs from the content store, and
writing them to disk is CPU bound. Usually, threads would be the preferred way
of speeding up such process, but unfortunately, Python has GIL that severely
limit the available concurrency. So, multiple processes were chosen.

Unfortunately, the multi-process solution also brings a lot of other issues,
more recently, we've had cases where the connections to the server and memcache
had to be dropped after the fork. In some other cases, this caused deadlocks.
And the solution is not effective on Windows.

Now that Mercurial is getting more and more Rust, we could instead go back to
the threads solution by using them in Rust, and have Python just push work to
them, this is exactly what this change does.

Things that are left to be done, but I wanted to get a diff out first:
 - no file path audit
 - no file backup
 - no symlink creation
 - probably other things I'm missing

Reviewed By: quark-zju

Differential Revision: D20102888

fbshipit-source-id: d47829fd7818b97710586b9851880f178048e27b
2020-03-11 01:13:54 -07:00
Xavier Deguillard
185bc0f437 revisionstore: add an LfsMultiplexer store
Summary:
With this new store, blobs will be transparently written to either an LFS
store, or a non-LFS one, depending on their size.

Initially, and as long as getpackv2 is supported, we also need to support
parsing lfs pointer data that the server is sending and write these to the lfs
pointer store. This code is very adhoc and does manual parsing of the pointer
data, definitively not great, suggestion for a simple and better solution is
welcome :).

From a migration standpoint, the read-only LFS stores are added to the
ContentStore, this allows blobs written in it to be readable at all time even
when `remotefilelog.lfs` isn't set. The code will effecitvely be dormant for a
while until the option is turned on, if we need to disable it, the dormant code
will still be able to read all the blobs written to disk. This forces us to
deploy a release that contains this code to stable first, before setting
`remotefilelog.lfs`.

Reviewed By: quark-zju

Differential Revision: D19986878

fbshipit-source-id: 260f5a542d52e748c0c703bfa7bb8ffac0e7b388
2020-03-10 18:14:54 -07:00
Jun Wu
5f84fc8222 indexedlog: use dev-logger
Summary: This makes `RUST_LOG` work for indexedlog tests.

Reviewed By: xavierd

Differential Revision: D20286515

fbshipit-source-id: ff4a1476eb01a9067dabe3622fd598f65fe86a18
2020-03-10 14:16:39 -07:00
Jun Wu
7a12c33163 dev-logger: a simple library to enable env_logger for testing
Summary:
The tracing / env_logger integration works for hg as a binary. However I'd also
like to use it in library tests. This crate makes it easier to do so.

Reviewed By: xavierd

Differential Revision: D20286507

fbshipit-source-id: f5bf3288ce950591ddfe64b524ad51ce21ee4099
2020-03-10 14:16:38 -07:00
Jun Wu
cf72dc45f5 indexedlog: add some tracing information
Summary: Those has helped me debugging some issues.

Reviewed By: xavierd

Differential Revision: D20286513

fbshipit-source-id: 012ddb16c2d0efd8f8697a5ecd4564ea31d65630
2020-03-10 14:16:38 -07:00
Jun Wu
e7ed737a64 hgcommands: make env_logger show exit code
Summary: Move the scope of spans so the exit code is shown.

Reviewed By: xavierd

Differential Revision: D20286516

fbshipit-source-id: f39cbf60c86ea19a1bb0a09958748f04ff6a42e8
2020-03-10 14:16:37 -07:00
Jun Wu
bb9023c2cb hgcommands: move env_logger initialization to hgcommands
Summary:
Previously env_logger is only initialized if Python is initialized.
This diff makes env_logger initialized for Rust native commands.

Reviewed By: xavierd

Differential Revision: D20286517

fbshipit-source-id: 18fee96c2b41db1da9648d615d1e18809de90a63
2020-03-10 14:16:37 -07:00
Jun Wu
97d0a976fd tracing: make it write to the log eco-system
Summary:
This means crates like env_logger (which reads $RUST_LOG, and writes to stderr)
can be used for convenient debugging.

Reviewed By: xavierd

Differential Revision: D20286514

fbshipit-source-id: e3b80cc4830ba5cc6dbf7aa1cbb92a4f4f046a54
2020-03-10 14:16:37 -07:00
Jun Wu
796f199130 tracing: save static metadata from tracing to Spans
Summary:
Those metadata include module_path, target, line number, etc, in Rust native
format.  They will be used for the upcoming `log` integration.

Reviewed By: xavierd

Differential Revision: D20286510

fbshipit-source-id: 27019b941bef08c0bb3e505bbdae642282dcb141
2020-03-10 14:16:36 -07:00
Stefan Filip
d8b4ddcecf dag: split lock file acquisition to own function
Summary:
Spliting lock file acquisition from `IdDag::prepare_filesystem_sync` to its own
function.
Useful when looking ahead to split IdDag from IndexedLog.

Reviewed By: quark-zju

Differential Revision: D20316443

fbshipit-source-id: a0fd43439730376920706bb4349ce497f6624335
2020-03-09 10:18:07 -07:00
Stefan Filip
620cdd96f2 dag: add IdDag::iter_segments_with_parent
Summary:
This removes an inline use of the indexedlog indexes.
This is going to be useful when we try to separate IndexedLog specifics from
IdDag functionality.

Reviewed By: quark-zju

Differential Revision: D20316058

fbshipit-source-id: 942a0a71660bb327376c81fd3ac435d002ecca6e
2020-03-09 10:18:07 -07:00
Kuba Zika
6a25dbee81 Simplify error pattern matching
Summary:
Instead of returning `anyhow::Error` wrapping an `ErrorKind` enum
from each Thrift client method, just return an error type specific
to that method. This will make error handling simpler and less
error-prone by removing the need to downcast the returned error.

This diff also removes the `ErrorKind` enums so that we can be sure
that there are no leftover places trying to downcast to them.

(Note: this ignores all push blocking failures!)

Reviewed By: dtolnay

Differential Revision: D20260398

fbshipit-source-id: f0dd96a7b83dd49f6b30948660456539012f82e6
2020-03-06 12:09:38 -08:00
Jun Wu
3103fcf62b indexedlog: reload content after obtaining a lock at open time
Summary:
The old code does "read, lock, write", which is unsound because after "lock"
the data just read can be outdated and needs a reload.

Reviewed By: xavierd

Differential Revision: D20306137

fbshipit-source-id: a1c29d5078b2d47ee95cf00db8c1fcbe3447cccf
2020-03-06 08:12:02 -08:00
Jun Wu
75e4ffc17f indexedlog: change IndexDef.lag_threshold back from entries to bytes
Summary:
I thought the index function could be the bottleneck. However, the Log reading
(xxhash, decoding vlqs) can be much slower for very long entries. Therefore
using bytes as the lag threshold is better. It does leaked the Log
implementation details (how it encodes an entry) to some extend, though.

Reverts D20042045 and D20043116 logically. The lagging calculation is using
the new Index::get_original_meta API, which is easier to verify correctness
(In fact, it seems the old code is wrong - it might skip Index flushes if
sync() is called multiple times without flushing).

This should mitigate an issue where a huge entry (generated by `hg trace`) in
blackbox does not get indexed in time and cause performance regressions.

Reviewed By: DurhamG

Differential Revision: D20286508

fbshipit-source-id: 7cd694b58b95537490047fb1834c16b30d102f18
2020-03-05 13:29:48 -08:00
Jun Wu
efff6f3592 indexedlog: add an API to get the Index meta that is not dirty
Summary: This will be used to more reliably detect index lags.

Reviewed By: DurhamG

Differential Revision: D20286518

fbshipit-source-id: c553b6587363a55603b75df12580588e3100e35f
2020-03-05 13:29:47 -08:00
Jun Wu
66e60bacb9 rotatelog: build indexes for older logs on access
Summary:
This ensures indexes are complete even if index format or definition has been
changed.

Reviewed By: DurhamG

Differential Revision: D20286509

fbshipit-source-id: fcc4ebc616a4501e4b6fd2f1a9826f54f40b99b8
2020-03-05 13:29:47 -08:00
Jun Wu
669c58bd56 blackbox: use RotateLog::iter_dirty()
Summary:
This avoids loading all blackbox logs when `init()` gets called multiple times
(for example, once in Rust and once in Python).

Reviewed By: DurhamG

Differential Revision: D20286511

fbshipit-source-id: ef985e454782b787feac90a6249651a882b6552e
2020-03-05 13:29:47 -08:00
Jun Wu
1c6310b9d6 rotatelog: add iter_dirty() API
Summary: This API has the benefit that it does not trigger loading older logs.

Reviewed By: DurhamG

Differential Revision: D20286512

fbshipit-source-id: 426421691ad1130cdbb2305612d76f18c9f8798c
2020-03-05 13:29:46 -08:00
Jun Wu
64ba669a51 nameset: add some tests for DagSet
Summary:
With the new crate-public interfaces and Debug implementations it's possible to
write tests for DagSet. So let's do it.

Reviewed By: sfilipco

Differential Revision: D20242561

fbshipit-source-id: 180e04d9535f79471c79c4307f6ab6e8e8815067
2020-03-05 11:46:18 -08:00
Xavier Deguillard
34bce8690f revisionstore: silence compiler warning
Summary:
Don't restrict constructing a c_api datapack store to only Unix, we can
construct it on Windows too by assuming that their path will be valid UTF-8.

Reviewed By: quark-zju

Differential Revision: D20250718

fbshipit-source-id: 07234b6a71b50c803cfe3b962fa727f57037c919
2020-03-05 09:35:57 -08:00
Xavier Deguillard
751fc53638 types: add an ancestors method to RepoPath
Summary: This returns the ancestors in the reverser order as the parents method.

Reviewed By: sfilipco

Differential Revision: D20265277

fbshipit-source-id: 83277cee3d8e9070fc56d20d4c1877e6782c22f7
2020-03-05 09:31:32 -08:00
Jun Wu
bb1562604a dag: make some test APIs public in crate
Summary: Those will be reused by nameset::DagSet.

Reviewed By: sfilipco

Differential Revision: D20242563

fbshipit-source-id: 944e9a04aeb15439256ecea64355b67e326e5c89
2020-03-04 17:33:25 -08:00
Jun Wu
b8e1477401 nameset: impl Debug for other sets
Summary:
This is useful for `assert_eq!(format!("{:?}", set), "...")` tests.

It will be eventually exposed to Python as `__repr__`, similar to Python's
smartsets.

Reviewed By: sfilipco

Differential Revision: D20242562

fbshipit-source-id: 5373bb180db7cafebf273ace7cf2cb80fbfb8038
2020-03-04 17:33:25 -08:00
Jun Wu
fa069204e3 nameset: impl Debug for StaticSet
Summary:
In the Python world all smartsets have some kind of "debug" information. Let's
do something similar in Rust.

Related code is updated so the test is more readable.

Reviewed By: sfilipco

Differential Revision: D20242564

fbshipit-source-id: 7439c93d82d5d037c7167818f4e1125c5a1e513e
2020-03-04 17:33:24 -08:00
Jun Wu
0ae5a59e9e indexedlog: fix metadata-only updates for Indexes
Summary:
Previously, `flush()` will skip writing the file if there are only metadata
changes. Fix it by detecting metadata changes.

This can potentially fix an issue that certain blackbox indexes are empty,
lagging and require scanning the whole log again and again. In that case,
the index itself is not changed (the root radix entry is not changed), but
only the metadata tracking how many bytes in Log the index covered
changed.

Reviewed By: sfilipco

Differential Revision: D20264627

fbshipit-source-id: 7ee48454a92b5786b847d8b1d738cc38183f7a32
2020-03-04 15:59:12 -08:00
Xavier Deguillard
314d1978ef clidispatch: silence warning on windows
Summary:
Using `if cfg!` instead of `#[cfg]` allows for the compiler to understand
that the arguments aren't unused, and silence the warnings.

Reviewed By: quark-zju

Differential Revision: D20242280

fbshipit-source-id: 332dfe17b3a80a1096d15c91c9fb6644bd10e0cd
2020-03-04 09:49:15 -08:00
Xavier Deguillard
7d9d38017c configparser: silence compiler warning
Summary:
Compiling it on Windows produced a bunch of warning due to
`hgrc_configset_load_path` not being compiled on it. Fixed it so it no longer
depends on Unix specific imports.

Reviewed By: quark-zju

Differential Revision: D20241102

fbshipit-source-id: 3002f961191fbb9bc51aa9ac1154d6d50bd7fe23
2020-03-04 09:49:14 -08:00
Xavier Deguillard
db76b7d52b procinfo: address compiler warning
Summary:
The `.into_iter()` for this object is being deprecated and won't compile in
the future, fix it now.

Reviewed By: quark-zju

Differential Revision: D20241103

fbshipit-source-id: fdee463ed81cd07a65f3cc4c70a96c88928b3b87
2020-03-04 09:49:14 -08:00
Xavier Deguillard
be7ae642ea commitcloudsubscriber: silence compiler warning
Summary:
While compiling on Windows, this file issues a bunch of warnings, use `if
cfg!` instead of `#[cfg]` to silence them. The behavior is the same, but the
later allows the compiler to recognize that some is not unused.

Reviewed By: quark-zju

Differential Revision: D20241104

fbshipit-source-id: 2cd7f171c7a2f7220cc73bea9be3359260de19b2
2020-03-04 09:49:14 -08:00
Jun Wu
49464342fd indexedlog: try to use symlink for atomic_write on unix
Summary:
The change is in theory not necessary. However it improves the reliability on
OS crashes a bit, and can potentially workaround some bugs in filesystems
(as we saw in production where the atomic-written files are empty and the
system didn't crash).

The idea is, the `symlink` syscall does the file creation and "content" writing
together, while there is no way to create a file and write specific content
in one syscall. Note that the C symlink call uses 0-terminated string, and
the Rust stdlib exports it as accepting `Path`. To be safe, we encode binary
or non-utf8 content using `hex`.

For downgrade safety, the write path does not use symlink by default unless
format.use-symlink-atomic-write is set to true. This makes downgrade possible:
the read path is rolled out first, then we can turn on and off the write path.

The indexedlog Rust unit tests and test-doctor.t are migrated to use the new
symlink code paths.

Reviewed By: DurhamG

Differential Revision: D20153864

fbshipit-source-id: c31bd4287a8d29575180fbcf7227d2b04c4c1252
2020-03-04 07:23:48 -08:00
Jun Wu
def12896db indexedlog: add a utility function to read files crated by atomic_write
Summary:
This makes it possible to implement atomic_write differently (ex. use a
symlink).

Reviewed By: DurhamG

Differential Revision: D20153865

fbshipit-source-id: 07fa78c2f2dac696668f477c75f65cf70950b73f
2020-03-04 07:23:47 -08:00
Jun Wu
7c5e47bab1 indexedlog: rename chunk_size_log to chunk_size_logarithm
Summary:
This makes it clear that `log` is a math concept, not an append-only file like
`Log`.

Reviewed By: DurhamG

Differential Revision: D20149376

fbshipit-source-id: 67d2e9584b15f48759ca9b6dfce4279a5b1365a0
2020-03-03 13:41:28 -08:00
David Tolnay
e988a88be9 rust: Rename futures_preview:: to futures::
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/

This codemod replaces *all* dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`.

This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on *both* 0.1 and 0.3 futures.

Codemod performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \
| xargs sed -i 's,\bfutures_preview::,futures::,'

rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,'
```

Reviewed By: k21

Differential Revision: D20213432

fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6
2020-03-03 11:01:20 -08:00
Genevieve Helsel
528015f9fe allow more hg fastpath cases
Reviewed By: simpkins

Differential Revision: D20143888

fbshipit-source-id: 4b1a73159bde6835626ad1766b2cf9dcd2faf6c4
2020-03-02 07:43:39 -08:00
Jun Wu
c718a5dc19 pathmatcher: add a test about a bug in globset/aho-corasick
Summary:
Also patch aho-corasick to fix the issue.

The issue was introduced by [an optimization path](063ca0d253) added in aho-corasick 0.7 series (used by globset 0.4.3).
aho-corasick 0.6.x (globset 0.4.2) are not affected.

The next aho-corasick release (0.7.9) contains the fix.

See https://github.com/BurntSushi/aho-corasick/issues/53 for more context.

Reported by: yns88

Reviewed By: DurhamG

Differential Revision: D20125697

fbshipit-source-id: 592375b43d7ee494bb3e916a1cb11c18f9ebe425
2020-02-28 22:09:28 -08:00
Jun Wu
10bb5a144e revset: replace some repo.revs with repo.nodes
Summary:
Migrate away from some uses of revision numbers.
Some dead code in discovery.py is removed.

I also fixed some test issues when I run tests locally.

Reviewed By: sfilipco

Differential Revision: D20155399

fbshipit-source-id: bfdcb57f06374f9f27be51b0980652ef50a2c8e0
2020-02-28 17:45:26 -08:00
Jun Wu
0220b4a0c3 nameset: make NameIter Send
Summary: This makes it possible to use NameIter in py_class.

Reviewed By: sfilipco

Differential Revision: D20020529

fbshipit-source-id: b9147b7dccb38d18d8361b420507fcbe97e01351
2020-02-28 16:35:25 -08:00
Jun Wu
782f2017aa dag: add hex prefix lookup
Summary: This will be used by commit hash prefix lookup.

Reviewed By: sfilipco

Differential Revision: D20020523

fbshipit-source-id: f2905ddf63098704b08dad8eb48272c3ffba7e25
2020-02-28 16:35:24 -08:00
Jun Wu
12441f48bf dag: re-export common types at top-level
Summary: Export common types at the top-level of the crate so it's easier to use.

Reviewed By: sfilipco

Differential Revision: D20020526

fbshipit-source-id: e9a0a8bc3cc91f81d0bc74e7530cd4613fc1dd61
2020-02-28 16:35:23 -08:00
Jun Wu
bc9f72ccf3 dag: implement DAG algorithms on NameDag
Summary: Those just delegate to IdDag for the actual calculation.

Reviewed By: sfilipco

Differential Revision: D20020522

fbshipit-source-id: 272828c520097c993ab50dac6ecc94dc370c8e8b
2020-02-28 16:35:23 -08:00
Jun Wu
b88da34fb0 dag: expose NameDag in tests
Summary: This allows tests to check NameDag APIs.

Reviewed By: sfilipco

Differential Revision: D20020525

fbshipit-source-id: 4ee8e4bcbd0731512ba17068e827b8045fc5d522
2020-02-28 16:35:23 -08:00
Jun Wu
194cd25f4f dag: add Arc<IdMap> to NameDag
Summary: This will be used to produce NameSet.

Reviewed By: sfilipco

Differential Revision: D20020519

fbshipit-source-id: abf6d73f2b985b74560d6b5db2800ff25450f02e
2020-02-28 16:35:22 -08:00
Jun Wu
7a343271b9 dag: rename NameDag::parents to NameDag::parent_names
Summary: This matches IdDag::parents (taking a set) and IdDag::parent_ids.

Reviewed By: sfilipco

Differential Revision: D20020524

fbshipit-source-id: 6e90727c355a7400f9a23e0b25e3392bdc032f49
2020-02-28 16:35:22 -08:00