367 Commits

Author SHA1 Message Date
Mark Thomas
5666399fcf mutationstore: switch mutation entry timestamp from f64 to i64
Summary:
The mutation store stores entries with a floating-point timestamp.  This
pattern was copied from obsmarkers.

However, Mercurial uses integer timestamps in the commit metadata (the
parser supports floats for historical reasons, but only stores integer
timestamps).   Mononoke also uses integer timestamps in its `DateTime`
type.

To keep things simple, switch to using integer timestamps for mutation
entries.  Existing entries with floating point timestamps are truncated.

Add a new entry format version that encodes the timestamp as an integer.
For now, continue to generate the old version so that old clients can
read entries created by new clients.
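
A minimal sketch of the truncation described above, assuming a hypothetical helper named `truncate_timestamp` (it is not part of the mutationstore API). In Rust, an `as` cast from `f64` to `i64` truncates toward zero, saturates at the integer bounds, and maps NaN to 0.

```rust
/// Hypothetical helper: convert a legacy floating-point timestamp
/// (seconds since the epoch) into an integer timestamp.
fn truncate_timestamp(legacy: f64) -> i64 {
    // `as` truncates toward zero; it saturates on overflow and maps NaN to 0.
    legacy as i64
}
```

For example, a legacy value of `1584443924.75` would be stored as `1584443924` under this sketch.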

Reviewed By: quark-zju

Differential Revision: D20444366

fbshipit-source-id: 4d6d9851aacb314abea19b87c9d0130c47fdf512
2020-03-17 04:18:44 -07:00
Mark Thomas
ac80212e8f mutationstore: remove mutation entry origins
Summary:
Tracking the origin of mutation entries did not prove useful and just creates
unnecessary overhead.  Remove the tracking and repurpose the field as a
version field.

Reviewed By: quark-zju

Differential Revision: D20444365

fbshipit-source-id: 65ff11ee8cfe77d5e67a83d03a510541d58ef69b
2020-03-17 04:18:44 -07:00
Xavier Deguillard
deffd9a477 minibytes: address clippy warnings
Summary: Using ptr.add is shorter and preferred to ptr.offset.
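
As an illustration of the warning (the function `read_at` is a made-up example, not code from minibytes): `ptr.add(n)` is equivalent to `ptr.offset(n as isize)` but takes a `usize` directly, which avoids the cast.

```rust
/// Read the element at `index` through a raw pointer.
fn read_at(data: &[u8], index: usize) -> u8 {
    assert!(index < data.len());
    let ptr = data.as_ptr();
    // Before: `*ptr.offset(index as isize)`. `add` takes a `usize` directly.
    // Safe because `index` was bounds-checked above.
    unsafe { *ptr.add(index) }
}
```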

Reviewed By: quark-zju

Differential Revision: D20452752

fbshipit-source-id: 1dc2fdbc392267d2d690673c10dcc161ecd00dfa
2020-03-16 14:58:22 -07:00
Xavier Deguillard
67c8cf22a3 hgtime: address clippy warnings
Summary:
These warnings are fairly trivial: Clippy recommends using a single-quoted
character (char) instead of a double-quoted string (str) when searching for a
single character.
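
A small sketch of the kind of change this implies. The function `split_pair` is illustrative only and is not taken from hgtime; the lint involved is Clippy's `single_char_pattern`.

```rust
/// Split a "key=value" pair at the first '='.
fn split_pair(s: &str) -> Option<(&str, &str)> {
    // Before: `s.find("=")`. A `char` pattern is preferred for a single character.
    let pos = s.find('=')?;
    Some((&s[..pos], &s[pos + 1..]))
}
```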

Reviewed By: quark-zju

Differential Revision: D20452408

fbshipit-source-id: b2951e133e57633a8e766536e22969fa9ac0ecee
2020-03-16 14:58:22 -07:00
Xavier Deguillard
bb30c40375 types: address clippy warnings
Summary:
Clippy had 3 sources of warnings in this crate:
 - from_str method not in impl FromStr. We still have 2 of them in path.rs, but
   this is documented as not supported by the FromStr trait due to returning a
   reference. Maybe we can find a different name?
 - Use of mem::transmute where casts are sufficient. I find the casts to be
   ugly, but they are simply safer as the compiler can do some type checking on
   them.
 - Unnecessary lifetime parameters
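
To illustrate the second point, here is a sketch of replacing a transmute with a pointer cast. The function `as_signed` is an assumption made for this example and does not come from the types crate.

```rust
/// Reinterpret a slice of unsigned bytes as signed bytes.
fn as_signed(bytes: &[u8]) -> &[i8] {
    // Before (flagged by Clippy):
    //   unsafe { std::mem::transmute::<&[u8], &[i8]>(bytes) }
    // The cast below only compiles for pointer-to-pointer conversions,
    // so the compiler checks more than it can with transmute.
    let ptr = bytes.as_ptr() as *const i8;
    // Safe because u8 and i8 have identical size and alignment,
    // and the length is unchanged.
    unsafe { std::slice::from_raw_parts(ptr, bytes.len()) }
}
```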

Reviewed By: quark-zju

Differential Revision: D20452257

fbshipit-source-id: 94abd8d8cd76ff7af5e0bbfc97c1e106cdd142b0
2020-03-16 14:58:21 -07:00
Xavier Deguillard
82d3c7f544 configparser: address clippy warnings
Summary:
Clippy complains about 3 things:
 - Using raw pointers in a public function that is not declared as unsafe. This
   happens for C exported ones, this feels like a warning, so I haven't changed
   it.
 - Using .map(...).unwrap_or(<default value constructed>). The recommendation
   is to use .unwrap_or_default().
 - Single match instead of if let, the latter makes code much shorter.
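
The last two points can be sketched as follows. Both functions are illustrative and are not taken from configparser.

```rust
use std::collections::HashMap;

/// Look up a config value, falling back to an empty string.
fn get_or_empty(map: &HashMap<String, String>, key: &str) -> String {
    // Before: map.get(key).cloned().unwrap_or(String::new())
    map.get(key).cloned().unwrap_or_default()
}

/// Return the length of a value, or 0 when the key is absent.
fn value_len(map: &HashMap<String, String>, key: &str) -> usize {
    let mut len = 0;
    // Before: match map.get(key) { Some(v) => len = v.len(), None => {} }
    if let Some(v) = map.get(key) {
        len = v.len();
    }
    len
}
```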

Reviewed By: quark-zju

Differential Revision: D20452751

fbshipit-source-id: 8eeff7581c119c651ca41d8117f1f70f15774833
2020-03-16 14:53:45 -07:00
Stefan Filip
1fb5acf242 dag: use IdDagStore in IdDag with type parameter
Summary: Make IdDag storage generic by depending on IdDagStore.
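
A rough sketch of what a storage type parameter can look like. The trait methods, `InMemoryStore`, and the `IdDag` methods below are all hypothetical and chosen for illustration; the real `IdDagStore` interface differs.

```rust
use std::collections::BTreeMap;

/// Hypothetical storage trait; the real `IdDagStore` methods differ.
trait IdDagStore {
    fn insert_parents(&mut self, id: u64, parents: Vec<u64>);
    fn parents(&self, id: u64) -> Option<&[u64]>;
}

/// Illustrative in-memory backend.
#[derive(Default)]
struct InMemoryStore {
    parents: BTreeMap<u64, Vec<u64>>,
}

impl IdDagStore for InMemoryStore {
    fn insert_parents(&mut self, id: u64, parents: Vec<u64>) {
        self.parents.insert(id, parents);
    }

    fn parents(&self, id: u64) -> Option<&[u64]> {
        self.parents.get(&id).map(|v| v.as_slice())
    }
}

/// The DAG is generic over its storage backend.
struct IdDag<S: IdDagStore> {
    store: S,
}

impl<S: IdDagStore> IdDag<S> {
    fn new(store: S) -> Self {
        IdDag { store }
    }

    fn add(&mut self, id: u64, parents: Vec<u64>) {
        self.store.insert_parents(id, parents);
    }

    /// Number of parents recorded for `id` (0 if unknown).
    fn parent_count(&self, id: u64) -> usize {
        self.store.parents(id).map_or(0, |p| p.len())
    }
}
```

With this shape, a test can construct `IdDag::new(InMemoryStore::default())` while production code plugs in an on-disk backend.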

Reviewed By: quark-zju

Differential Revision: D20471712

fbshipit-source-id: 3a2668f301758a3c880db35c9f0db6887ef1dd38
2020-03-16 14:41:41 -07:00
Stefan Filip
236292c0fd dag: add the GetLock trait
Summary: Used to generalize `get_lock` functionality.

Reviewed By: quark-zju

Differential Revision: D20471710

fbshipit-source-id: e44d5b22ecacdb653170ef83914354f521f82dfc
2020-03-16 14:41:40 -07:00
Stefan Filip
66436b4a3c dag: add the IdDagStore trait
Summary: Abstract the storage functionality required by IdDag.

Reviewed By: quark-zju

Differential Revision: D20449122

fbshipit-source-id: fc3c7d7b88d74f7a93670d310be2e680f35e8ce7
2020-03-16 14:41:40 -07:00
Stefan Filip
1239628ef8 dag: move IdDag storage details to the iddagstore module
Summary:
Right now the module has one implementation, IndexedLogStore. The name could
be more specific in the context of the crate.

The goal will be to add a trait for storage requirements of IdDag and
make IndexedLogStore one implementation of that trait.

Reviewed By: quark-zju

Differential Revision: D20446042

fbshipit-source-id: 7576e1cc4ad757c1a2c00322936cc884838ff710
2020-03-16 14:41:40 -07:00
Jun Wu
1f64b4ec50 nameset: fix LazySet iteration
Summary:
The `next` method forgot to increase the iteration index, causing infinite
iteration.
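
The class of bug can be sketched with a simplified iterator. The `Iter` type here is an assumption for illustration and is much simpler than the real LazySet iterator.

```rust
/// Illustrative iterator over a buffer of already-computed items.
struct Iter<'a> {
    items: &'a [u32],
    index: usize,
}

impl<'a> Iterator for Iter<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let item = self.items.get(self.index).copied();
        if item.is_some() {
            // Without this increment, `next` would yield the first item forever.
            self.index += 1;
        }
        item
    }
}
```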

Reviewed By: ikostia

Differential Revision: D20473206

fbshipit-source-id: 82a95de1b1c12ac4e9e4d328a0adba7145d7b24c
2020-03-16 13:00:35 -07:00
Jun Wu
8115053c00 indexedlog: implement xxd-like fmt::Debug for Log
Summary: This makes `hg debugindexedlog dump` more useful.

Reviewed By: sfilipco

Differential Revision: D20448863

fbshipit-source-id: c5cc24449ae00ee329ce02bf0adf947ff57e72ed
2020-03-16 10:21:46 -07:00
Durham Goode
a13fcd4910 workingcopy: support returning directories from the walker
Summary:
Purge needs to be able to see what directories the walker traversed, so
it can delete them if they are empty. Instead of having the walker call
match.traversedir (using the matcher as a holder for a function unrelated to
matching seems like a bizarre pattern), let's have the walker return an
enum and have an option to return directories.

At the python layer we then translate this into match.traversedir calls, but we
can clean that up later.
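
A minimal sketch of the enum-based result, assuming hypothetical names (`WalkEntry`, `filter_entries`, `include_directories`) that are not taken from the workingcopy crate.

```rust
use std::path::PathBuf;

/// Hypothetical walker result; the variant names are assumptions.
#[derive(Debug, PartialEq)]
enum WalkEntry {
    File(PathBuf),
    Directory(PathBuf),
}

/// Keep directory entries only when the caller asked for them.
fn filter_entries(entries: Vec<WalkEntry>, include_directories: bool) -> Vec<WalkEntry> {
    entries
        .into_iter()
        .filter(|e| include_directories || matches!(e, WalkEntry::File(_)))
        .collect()
}
```

A caller such as purge would enable the option and inspect the `Directory` entries, while other callers keep seeing files only.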

Reviewed By: quark-zju

Differential Revision: D19543795

fbshipit-source-id: cc51c86c91799d3df2c65d25a7b6cfe810206d0a
2020-03-16 10:15:26 -07:00
Durham Goode
fc7739fa26 workingcopy: rename walker results
Summary:
In preparation for supporting returning directories from the walker (to
support purge), let's rename the result structure to be more generic.

Reviewed By: kulshrax

Differential Revision: D19543791

fbshipit-source-id: 9b71452c879cf397ae92533a4ef4727140ac7369
2020-03-16 10:15:26 -07:00
Durham Goode
05e09b2b89 workingcopy: report invalid file types from rust walker
Summary:
The mercurial tests print errors when they encounter 'fifo' files.
Let's handle that case.

Differential Revision: D19543796

fbshipit-source-id: f87d4b9c3f0ad8b8d8ebe2e6d18e325fc93d0ae9
2020-03-16 10:15:25 -07:00
Xavier Deguillard
fd8d92f1f5 revisionstore: allow indexing LFS pointers via sha256
Summary:
While the sha256 of a blob gives access to its content, it doesn't allow
accessing its metadata. By adding a sha256 index, we can easily get the
metadata of a blob via its content hash.

Reviewed By: quark-zju

Differential Revision: D20445624

fbshipit-source-id: 42c04bd69d3c7380706c6237c5b4f4061c016cca
2020-03-13 19:03:29 -07:00
Xavier Deguillard
d9cca63444 types: add an into_inner method to Sha256
Reviewed By: quark-zju

Differential Revision: D20445623

fbshipit-source-id: d9cba7ddd16a8e89c76cd5e988ab0fb79383d0c2
2020-03-13 19:03:29 -07:00
Xavier Deguillard
60be0ac94d types: fix typo when displaying Sha256
Reviewed By: quark-zju

Differential Revision: D20445622

fbshipit-source-id: dc9a8a165ca55fdece90a5eb3a87cd3c28f444cb
2020-03-13 19:03:29 -07:00
Xavier Deguillard
6ee3a8f42f revisionstore: add metadata to FakeHgIdRemoteStore
Summary: This is necessary to properly test LFS stores.

Reviewed By: quark-zju

Differential Revision: D20445625

fbshipit-source-id: 530ddf87249e8d721957806f2d8edef3262f303c
2020-03-13 19:03:28 -07:00
Xavier Deguillard
5002d01e0a revisionstore: allow indexedlogutil users to lookup in different indices
Summary:
The OpenOptions allow for multiple indices to be added, but lookup had no way
to query these multiple indices.

Reviewed By: quark-zju

Differential Revision: D20445627

fbshipit-source-id: 0cb754ba17b452d892b7bcb56d502d5753ef963a
2020-03-13 19:03:28 -07:00
Xavier Deguillard
01fb3c0a77 revisionstore: add a new StoreKey type
Summary:
This type can either be a Mercurial type key, or a content hash based key. Both
prefetch and get_missing can now handle these properly. This is essential
for stores where data can be fetched either way, or where the data is
split in 2. For LFS, for instance, it is possible to have the LFS pointer (via
getpackv2), but not the actual blob. In that case get_missing will simply
return the content hash version of the StoreKey, to signify what is actually
missing.
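
The LFS case above can be sketched as follows. `Key`, `ContentHash`, the variant names, and `LfsStore` are simplified stand-ins assumed for this example; they are not the revisionstore definitions.

```rust
/// Simplified stand-in for a Mercurial key (path + filenode).
#[derive(Clone, Debug, PartialEq)]
struct Key {
    path: String,
    hgid: [u8; 20],
}

/// Simplified stand-in for a content hash.
#[derive(Clone, Debug, PartialEq)]
enum ContentHash {
    Sha256([u8; 32]),
}

/// A key that is either Mercurial-based or content-hash-based.
#[derive(Clone, Debug, PartialEq)]
enum StoreKey {
    HgId(Key),
    Content(ContentHash),
}

/// Toy LFS store: it may know a pointer without holding the blob.
struct LfsStore {
    pointer: Option<(Key, ContentHash)>,
    has_blob: bool,
}

impl LfsStore {
    fn get_missing(&self, keys: &[StoreKey]) -> Vec<StoreKey> {
        let mut missing = Vec::new();
        for key in keys {
            match (key, &self.pointer) {
                // Pointer known: only the blob itself can be missing.
                (StoreKey::HgId(k), Some((known, hash))) if k == known => {
                    if !self.has_blob {
                        missing.push(StoreKey::Content(hash.clone()));
                    }
                }
                // Blob requested by content hash and present locally.
                (StoreKey::Content(h), Some((_, hash))) if h == hash && self.has_blob => {}
                // Anything else is missing exactly as requested.
                _ => missing.push(key.clone()),
            }
        }
        missing
    }
}
```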

Reviewed By: quark-zju

Differential Revision: D20445631

fbshipit-source-id: 06282f70214966cc96e805e9891f220b438c91a7
2020-03-13 19:03:28 -07:00
Xavier Deguillard
d900874401 revisionstore: rename HistoryStore to HgIdHistoryStore
Summary:
Similarly to the DataStore trait, this makes it easier to understand that they
deal with a Mercurial type Key.

Reviewed By: quark-zju

Differential Revision: D20445621

fbshipit-source-id: a1143d5f5d6a2c8686d517a6ea3c25b07c0df072
2020-03-13 19:03:27 -07:00
Xavier Deguillard
2e4742cefc revisionstore: rename DataStore traits to HgIdDataStore
Summary: This makes it clear that these traits are dealing with Mercurial Key.

Reviewed By: quark-zju

Differential Revision: D20445626

fbshipit-source-id: d5acbf442e9407b973e95e40af69b5a61bff0a4d
2020-03-13 19:03:27 -07:00
Jun Wu
cf04fe3e1f thrift-types: recompile Thrift sources
Summary: The thrift compiler and sources are changed.

Reviewed By: xavierd

Differential Revision: D20445164

fbshipit-source-id: f20f16ae02a922042f366a9a80a3642577f60e57
2020-03-13 14:25:23 -07:00
Jun Wu
7a7f98f1b2 configparser: migrate from Bytes to Text
Summary:
Since configparser enforces utf-8 config files (because pest wants Rust strings),
let's migrate from Bytes to Text to remove extra encoding conversions.

Previously this was blocked by the lack of ref-counted text (since the "source"
of each config location is the entire config file). Now minibytes provides Text
so we can use it.

This unfortunately requires dependent code to be updated. The pyconfigparser
interface is in theory wrong - it shouldn't return utf-8 bytes but
local-encoded bytes. I think it's cleaner to make pyconfigparser unaware of
HGENCODING, so I changed pyconfigparser to use unicode, and added a
compatibility layer in uiconfig.py.

This also fixes non-ascii encoding issues with user names (especially on
Windows). The hgrc config file should be in utf-8, the config parser returns
explicit unicode types, and Python code round-trips them with local encodings.

Reviewed By: markbt

Differential Revision: D20432938

fbshipit-source-id: b1359429b8f1c133ab2d6b2deea6048377dfeca1
2020-03-13 10:51:41 -07:00
Jun Wu
715bc5d451 configparser: migrate from bytes to minibytes
Summary:
This makes it easier to further migrate to `Text` interface.
Dependent crate (`auth`) is updated.

Reviewed By: markbt

Differential Revision: D20432941

fbshipit-source-id: 1dc29d52c9b17ce14676ef0555470c6d36a09c2b
2020-03-13 10:51:41 -07:00
Jun Wu
c4ec99ded4 minibytes: implement Text
Summary:
Text is a reference-counted shared String.
It's similar to Bytes but works for utf-8 strings.

The motivation is to replace configparser's use of Bytes with Text.
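
The idea of a reference-counted shared string can be sketched with `Arc<str>`. `SharedText` is a toy type assumed for this example and is not how minibytes implements `Text`.

```rust
use std::sync::Arc;

/// Toy ref-counted text slice; not the minibytes implementation.
#[derive(Clone)]
struct SharedText {
    owner: Arc<str>,
    start: usize,
    end: usize,
}

impl SharedText {
    fn new(s: &str) -> Self {
        SharedText { owner: Arc::from(s), start: 0, end: s.len() }
    }

    /// Sub-slice without copying the underlying buffer.
    /// Panics if the range is out of bounds or not on char boundaries.
    fn slice(&self, start: usize, end: usize) -> Self {
        let (s, e) = (self.start + start, self.start + end);
        assert!(s <= e && e <= self.end);
        assert!(self.owner.is_char_boundary(s) && self.owner.is_char_boundary(e));
        SharedText { owner: Arc::clone(&self.owner), start: s, end: e }
    }

    fn as_str(&self) -> &str {
        &self.owner[self.start..self.end]
    }
}
```

A config value sliced out of a whole config file then shares the file's buffer instead of owning a copy.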

Reviewed By: markbt

Differential Revision: D20432940

fbshipit-source-id: ef990255d269e60d433c6520819f60ccdcbe488f
2020-03-13 10:51:41 -07:00
Jun Wu
7895e70dcf minibytes: make Bytes abstract
Summary: This makes it possible to implement "Text". See the next diff.

Reviewed By: markbt

Differential Revision: D20432943

fbshipit-source-id: 94b3810ab205c260d33f57bd637e4accc3ee871d
2020-03-13 10:51:40 -07:00
Jun Wu
e9b14b3608 minibytes: implement From<&'static {str,[u8]}>
Summary:
This makes the API easier to use.

Practically this makes it easier for configparser to migrate to minibytes.

Reviewed By: markbt

Differential Revision: D20432942

fbshipit-source-id: ad08eb118d2216054dc24c86b0b129ae82b9d17c
2020-03-13 10:51:40 -07:00
Jun Wu
ad8190713b cpython-ext: serialize Rust str into Python str type
Summary:
Previously Rust str was serialized into bytes. To be Python 3 friendly, let's
serialize it into `str`.

Reviewed By: markbt

Differential Revision: D19797706

fbshipit-source-id: 388eb044dc7e25cdc438f0c3d6fa5a5740f22e3d
2020-03-12 12:19:38 -07:00
Jun Wu
3376363721 tracing-collector: add is_event to TreeSpan
Summary: Expose the is_event property via public APIs.

Reviewed By: DurhamG

Differential Revision: D19797705

fbshipit-source-id: f441825e98208964f7b3d6815a177b464430cbb7
2020-03-12 12:19:38 -07:00
Stanislau Hlebik
ba871d3bdc xdiff: allow rendering diff for large files
Summary:
The goal of the stack is to support "rendering" diffs for large files in scs
server. Note that rendering is in quotes - we are fine with just showing a
placeholder like "Binary file ... differs". This is still better than the
current behaviour, which just returns an error.

In order to do that I suggest tweaking the xdiff library to accept a
FileContentType, which can be either Normal(...), meaning that we have the file
content available, or Omitted, which usually means the file is large and we
don't even want to fetch it; we just want xdiff to generate a placeholder.
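
A hedged sketch of the proposed input type. The exact payloads of `Normal` and `Omitted`, the `render` function, and the placeholder strings are assumptions for illustration and do not reflect the xdiff crate's API.

```rust
/// Hypothetical shape of the input type described above.
enum FileContentType<'a> {
    /// File content is available.
    Normal(&'a [u8]),
    /// Content was not fetched (for example, the file is too large).
    Omitted,
}

/// Render a one-line description of the change to `path`.
fn render(path: &str, old: FileContentType<'_>, new: FileContentType<'_>) -> String {
    match (old, new) {
        // Identical content: nothing to show.
        (FileContentType::Normal(a), FileContentType::Normal(b)) if a == b => String::new(),
        // Both sides available: a real diff would be produced here.
        (FileContentType::Normal(_), FileContentType::Normal(_)) => {
            format!("(textual diff of {})", path)
        }
        // At least one side omitted: fall back to a placeholder.
        _ => format!("Binary file {} differs", path),
    }
}
```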

Reviewed By: markbt, krallin

Differential Revision: D20389226

fbshipit-source-id: 0b776d4f143e2ac657d664aa9911f6de8ccfea37
2020-03-12 04:27:23 -07:00
Jun Wu
194b38385a nameset: add a way to convert between NameSet and SpanSet
Summary:
This will be used in the Python world for legacy reasons. It shouldn't be used
in new Rust code.

To use it, the name `LegacyCodeNeedIdAccess` has to be used so we can do a code
search to find all users of it.

Reviewed By: sfilipco

Differential Revision: D20367834

fbshipit-source-id: 9b93a29f1461ce24bba6f31a2bbb1f327e216c6d
2020-03-11 20:37:30 -07:00
Jun Wu
eef56d9c5b namedag: add a sort API
Summary: This will be useful to actually sort commits.

Reviewed By: sfilipco

Differential Revision: D20367835

fbshipit-source-id: 43bc7835277af3a14ef323ce34247e0c03878dc8
2020-03-11 20:37:29 -07:00
Jun Wu
2ecc0bb757 namedag: move "all" concept to DagSet
Summary:
The old "AllSet" implementation is not very practical - it does not support
iteration. Practically, the "all()" set comes from the DAG. Change the "all"
concept to a hint similar to "is_topo_sorted", and update the fast path
(intersection) accordingly.

Reviewed By: sfilipco

Differential Revision: D20367837

fbshipit-source-id: fdbf370897c93058bfcab0571c1f6fa4b99b0f6b
2020-03-11 20:37:29 -07:00
Jun Wu
ef1696b4db namedag: rename arc_map to snapshot_map
Summary: The word "snapshot" more accurately describes its purpose.

Reviewed By: sfilipco

Differential Revision: D20367836

fbshipit-source-id: c91a0bd402fa1718b5d805beedc0e062824c53d3
2020-03-11 20:37:29 -07:00
Jun Wu
c5c75c9f59 fsinfo: autocorrect "" to "."
Summary:
Without this:

  In [3]: util.getfstype('')
  IOError: [Errno 2] No such file or directory (os error 2)

And there is a code path hitting this:

  File "edenscm/mercurial/util.py", line 1483, in checknlink
    fstype = getfstype(os.path.dirname(testfile))
    # testfile = '.'
    # os.path.dirname(".") = ""

The old implementation works fine for an empty path:

  In [2]: m.util.getfstype('')
  Out[2]: 'eden'

So let's make the new Rust implementation consistent.
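
The autocorrection can be sketched as a small normalization step. `normalize` is a hypothetical helper name and is not taken from the fsinfo crate.

```rust
use std::path::Path;

/// Map an empty path to "." before querying the filesystem.
fn normalize(path: &Path) -> &Path {
    if path.as_os_str().is_empty() {
        Path::new(".")
    } else {
        path
    }
}
```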

Reviewed By: xavierd

Differential Revision: D20313387

fbshipit-source-id: 258c424a3e8a796d983e20b0d4656e8e3f413706
2020-03-11 17:35:40 -07:00
Jun Wu
61bebcaacc fsinfo: try harder to get fuse fs type
Summary: Similar to D13982877. Try to get names like "fuse.ntfs".

Reviewed By: farnz

Differential Revision: D20313392

fbshipit-source-id: 8363d3d92843e6afb53a0003950be083034bd841
2020-03-11 17:35:39 -07:00
Jun Wu
13374f9d74 fsinfo: drop most type parameters
Summary:
Only keep type parameters at the top-level function.
This reduces the binary size and speeds up rustc.

Reviewed By: xavierd

Differential Revision: D20313388

fbshipit-source-id: 29d77731ff462fee1f1bb9f234601e3430198ae7
2020-03-11 17:35:39 -07:00
Jun Wu
c83006002c fsinfo: return unknown on unsupported platforms
Summary: This makes the code a bit more portable.

Reviewed By: xavierd

Differential Revision: D20313389

fbshipit-source-id: 080538939fa4d2d72e5905f23ad9be987d952748
2020-03-11 17:35:38 -07:00
Jun Wu
9cdc818915 fsinfo: drop "repo" from method names
Summary:
Rename the main method to "fstype". The API has no relation with repo.
So let's rename it.

Reviewed By: xavierd

Differential Revision: D20313386

fbshipit-source-id: 80dd1231ccccfe945150b117b151bce773f0dfeb
2020-03-11 17:35:38 -07:00
Jun Wu
951c8ab082 fsinfo: backport from telemetry
Summary: The fsinfo crate provides the "filesystem type" information.

Reviewed By: xavierd

Differential Revision: D20313391

fbshipit-source-id: f717f5edb32957d59d03090117cfdb8123f03933
2020-03-11 17:35:37 -07:00
Xavier Deguillard
f466037b4b revisionstore: fix memcache test flakiness
Summary:
Since the mocked memcache is shared between the tests, we need to make sure the
keys used by the tests are different, otherwise they are just caching each
other's data.

Reviewed By: ikostia

Differential Revision: D20388783

fbshipit-source-id: 0f2f926e0ffe0e52e55291e46142808ce0921288
2020-03-11 15:58:03 -07:00
Jun Wu
97e9b81ba5 indexedlog: remove compiler warnings on Windows
Summary:
Some `use`s are not used on Windows. The code was also formatted using the
latest rustfmt.

Reviewed By: xavierd

Differential Revision: D20379704

fbshipit-source-id: ffadcd68e4e0440dcbd2a4e1ad8532b47a9d83e2
2020-03-11 15:54:19 -07:00
Xavier Deguillard
c98b9cfff9 revisionstore: remove Arc from MetadataStore
Summary: Similarly to the ContentStore, remove the Arc from MetadataStore.

Reviewed By: quark-zju

Differential Revision: D20376838

fbshipit-source-id: 4321600b752c919b6d9fa7bdee6f6cb7ae083b10
2020-03-11 13:39:06 -07:00
Xavier Deguillard
7e704ec7fb revisionstore: remove the Arc from ContentStore
Summary:
The clients should use an Rc/Arc if they need the ability to clone it. This
makes it more obvious and reduces the number of pointer indirections.

Reviewed By: quark-zju

Differential Revision: D20376839

fbshipit-source-id: c56e7e8f89ab17727be621894c329e344a7f3adb
2020-03-11 13:39:05 -07:00
Jun Wu
4960709aa3 dag: do not depend on types
Summary:
The dag crate is designed to work with any kind of binary commit hashes (ex. bonsai,
git or hg). The only use of `types` is to convert from binary to hex. Since dag
already has its own `to_hex` logic in `VertexName`, let's use that instead.

Reviewed By: sfilipco

Differential Revision: D20378447

fbshipit-source-id: 00ecb551ea927fdb60dd91e5e645064f23139bcd
2020-03-11 10:49:31 -07:00
Jun Wu
009ea22175 indexedlog: retry rename in atomic_write on Windows
Summary:
Recently there has been some Windows-related test flakiness in . All of the
failures are caused by `file.persist(path)` in `atomic_write_plain` failing with
"Access Denied". Since that can be caused by Windows Anti-Virus scans or other
weird stuff, let's work around it using automatic retries.

Process Explorer does not provide extra information:

    indexedlog-d0c6135fd7ed9ece.exe	5868	SetRenameInformationFile	C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpcfDsQQ	ACCESS DENIED	ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta

A successful rename looks like:

    indexedlog-d0c6135fd7ed9ece.exe	5868	SetRenameInformationFile	C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\.tmpbXEVw0	SUCCESS	ReplaceIfExists: True, FileName: C:\Users\quark\AppData\Local\Temp\.tmpKERc5G\meta
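
A generic sketch of the retry idea, assuming a hypothetical helper `retry_on_access_denied`; the attempt limit and the sleep duration are illustrative and are not the values used by indexedlog. Rust's standard library reports the Windows "Access Denied" error as `io::ErrorKind::PermissionDenied`.

```rust
use std::{io, thread, time::Duration};

/// Run `op`, retrying up to `attempts` times in total while it fails
/// with a permission-denied error. Other errors are returned immediately.
fn retry_on_access_denied<T>(
    attempts: u32,
    mut op: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut tried = 0;
    loop {
        tried += 1;
        match op() {
            Err(e) if e.kind() == io::ErrorKind::PermissionDenied && tried < attempts => {
                // Short pause, assuming whoever holds the file lets go soon.
                thread::sleep(Duration::from_millis(1));
            }
            other => return other,
        }
    }
}
```

The rename would be passed in as the closure, so the retry policy stays separate from the file operation itself.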

Reviewed By: ikostia

Differential Revision: D20379618

fbshipit-source-id: db3e6be3d785875486f7a517df11cbf58bf65ddd
2020-03-11 10:06:47 -07:00
Xavier Deguillard
5d230aef68 backingstore: use get_file_content to strip metadata
Summary:
Now that the ContentStore can automatically strip the metadata header, no need
for duplicated code in the backingstore.

Reviewed By: fanzeyi

Differential Revision: D20376812

fbshipit-source-id: e863e1cc2fcdc8b9e612a464b305fa25ceb66e13
2020-03-11 09:40:26 -07:00
Xavier Deguillard
40bbe7b4da merge: add a Rust threaded file updater
Summary:
During `hg update`, Mercurial forks multiple processes to write files on disk
concurrently. This is done because fetching blobs from the content store and
writing them to disk is CPU bound. Usually, threads would be the preferred way
of speeding up such a process, but unfortunately, Python has a GIL that severely
limits the available concurrency. So, multiple processes were chosen.

Unfortunately, the multi-process solution also brings a lot of other issues.
More recently, we've had cases where the connections to the server and memcache
had to be dropped after the fork. In some other cases, this caused deadlocks.
And the solution is not effective on Windows.

Now that Mercurial is getting more and more Rust, we could instead go back to
the threads solution by using them in Rust and having Python just push work to
them. This is exactly what this change does.

Things that are left to be done, but I wanted to get a diff out first:
 - no file path audit
 - no file backup
 - no symlink creation
 - probably other things I'm missing
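
The general shape of "push work to Rust threads" can be sketched with the standard library alone. `write_files` and its channel-based design are assumptions for illustration; this is not the updater added by this change, and like it, the sketch does no path auditing, backups, or symlink handling.

```rust
use std::fs;
use std::path::PathBuf;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Write `(path, content)` pairs using `workers` threads.
/// Returns the number of files written successfully.
fn write_files(items: Vec<(PathBuf, Vec<u8>)>, workers: usize) -> usize {
    let (tx, rx) = mpsc::channel::<(PathBuf, Vec<u8>)>();
    let rx = Arc::new(Mutex::new(rx));

    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut written = 0usize;
                loop {
                    // The lock is held only while receiving, not while writing.
                    let job = rx.lock().unwrap().recv();
                    match job {
                        Ok((path, content)) => {
                            if fs::write(&path, &content).is_ok() {
                                written += 1;
                            }
                        }
                        // The sender was dropped: no more work.
                        Err(_) => break,
                    }
                }
                written
            })
        })
        .collect();

    // The producer side: in the design above, this role belongs to Python.
    for item in items {
        tx.send(item).unwrap();
    }
    drop(tx);

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```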

Reviewed By: quark-zju

Differential Revision: D20102888

fbshipit-source-id: d47829fd7818b97710586b9851880f178048e27b
2020-03-11 01:13:54 -07:00