43068 Commits

Author SHA1 Message Date
Jun Wu
d0886b9b94 gitignore: add a config option
Summary:
gitignore could have performance issues stating .gitignore files everywhere.
That happens if watchman returns O(working copy) files. Add a config to
disable it as we're finding solutions.

Reviewed By: DurhamG

Differential Revision: D7482499

fbshipit-source-id: 4c9247b0318bf034c8e9af4b74c21110cc598714
2018-04-13 21:51:45 -07:00
Alexandre Marin
1b393dbae1 Get file contents from local storage if possible
Summary:
Turns out I incorrectly assessed this situation before. We do use content from
perforce servers a lot. This change makes p4seqimport read from local disk
directly if possible rather than relying solely on `p4 print` to obtain file content.

```name=Checking file content src on master-importer task 0 (running for 15h+)
[15:40:23 twsvcscm@priv_global/independent_devinfra/ovrsource-master-importer/0 ~]$ egrep -o 'src: (gzip|rcs|p4)' /logs/stdout | sort | uniq -c
   2567 src: gzip
     24 src: p4
```

Differential Revision: D7388797

fbshipit-source-id: 5fe1a525bc211d64a75954d529edc152d22970a7
2018-04-13 21:51:45 -07:00
Phil Cohen
502ebf39fb add mutablepack.destpath, set once a pack gets written
Summary:
Subsequent commits will need the new path of a mutable{data, hist}pack -- this makes
that data accessible.

Reviewed By: DurhamG

Differential Revision: D7369226

fbshipit-source-id: f6849aaed747fbd9afee7191e6a0e5e1357ca618
2018-04-13 21:51:45 -07:00
Durham Goode
e53a3e253a hg: don't use statvfs when it's not available
Summary: fastmanifest used the statvfs function to be smart about how much disk space it used. That function isn't available on Windows, though. This optimization is optional, and we probably won't end up using the fastmanifest cache on Windows anyway, so let's just skip it if it's not available.
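
The guard described above can be sketched roughly as follows. This is a minimal illustration, not the fastmanifest code: `free_disk_bytes` and `cache_size_limit` are hypothetical names, and the 10% policy is an assumption made up for the example.

```python
import os


def free_disk_bytes(path):
    """Return the free bytes on the filesystem holding `path`, or None.

    None means the platform has no os.statvfs (for example Windows), so the
    caller should skip any disk-space-based tuning instead of failing.
    """
    statvfs = getattr(os, "statvfs", None)
    if statvfs is None:
        return None
    stats = statvfs(path)
    # f_bavail: blocks available to unprivileged users; f_frsize: block size.
    return stats.f_bavail * stats.f_frsize


def cache_size_limit(path, fraction=0.1, default=None):
    """Hypothetical policy: cap a cache at a fraction of the free space."""
    free = free_disk_bytes(path)
    if free is None:
        return default  # optimization skipped, statvfs is unavailable
    return int(free * fraction)
```

Looking the function up with `getattr` keeps the optional optimization from raising `AttributeError` on platforms where `os.statvfs` does not exist.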

Reviewed By: quark-zju

Differential Revision: D7478478

fbshipit-source-id: e9595f3fef397d66d76f3ecfa54f8e4328ce0921
2018-04-13 21:51:45 -07:00
Alexandre Marin
d3523cdc77 Feedback
Summary:
dsp had a look at the whole stack and suggested some changes:

* Only write bookmark once at the end of the import - we are doing a single transaction anyways so updating the bookmark after every changelist import is moot
* Remove unused function seqimporter.ChangelistImporter._safe_open
* Require fncache to preserve behavior from p4fastimport

Differential Revision: D7375481

fbshipit-source-id: f4407d5d0276f96d72bf67544091640fe1c46044
2018-04-13 21:51:44 -07:00
Alexandre Marin
275306c97a importer - use seqimport
Summary: Updates the importer wrapper to use the new p4seqimport, replacing p4fastimport.

Differential Revision: D7326764

fbshipit-source-id: 588486bfd747086396f47e678da05c6eafd30565
2018-04-13 21:51:44 -07:00
Alexandre Marin
b807fcb05d Make it work with remotefilelog
Summary:
When testing p4seqimport with remotefilelog it would barf on call to `.tip()`,
because remotefilelog doesn't have that.

This change makes use of the change context from the repo instead to get the
tip node.

Differential Revision: D7294979

fbshipit-source-id: 18b4a5107f4cbf676016d44d5134bf0d252eeff3
2018-04-13 21:51:44 -07:00
Alexandre Marin
738df53c01 Test branching
Summary:
Testing that p4seqimport works properly for branching
Based on comment on D7172867

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7203765

fbshipit-source-id: 2e328f5b43fc47a60bfe2c41f9454f8471dda814
2018-04-13 21:51:44 -07:00
Alexandre Marin
9e6c6e4e3f Handle p4 keyworded files
Summary:
Perforce supports RCS keyworded files, more info here:
http://answers.perforce.com/articles/KB/3482

We replace things back in p4fastimport, this replicates the behavior in
p4seqimport (unit test should clarify what this means)
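
One plausible reading of "replace things back" is collapsing expanded keywords such as `$Id: ... $` to their bare form `$Id$`. The sketch below illustrates that idea only; `collapse_keywords` is a hypothetical helper, the keyword list is an assumption, and the real importer may differ (for instance by only touching files whose Perforce type enables keyword expansion).

```python
import re

# Assumed set of keyword names that Perforce may expand.
_KEYWORDS = rb"Id|Header|Date|DateTime|Change|File|Revision|Author"
# Matches an expanded keyword such as b"$Id: //depot/a.c#3 $" on one line.
_EXPANDED = re.compile(rb"\$(" + _KEYWORDS + rb"):[^$\n]*\$")


def collapse_keywords(data):
    """Return `data` with expanded keywords collapsed back to b"$Name$"."""
    return _EXPANDED.sub(rb"$\1$", data)
```

Text that contains only unexpanded keywords passes through unchanged, so applying the function twice gives the same result as applying it once.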

Differential Revision: D7188163

fbshipit-source-id: 594f71d6114c73001753ae36c4973c2db3310e62
2018-04-13 21:51:44 -07:00
Alexandre Marin
d4c8ee992a Track executable files
Summary:
Respect the executable bit on files based on perforce type.
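
As a rough sketch of how the executable bit might be derived from a Perforce file type string: Perforce writes types as a base type plus optional modifiers (for example `text+x`), where the `x` modifier marks the file as executable. `is_executable` is a hypothetical helper, and the legacy alias set below is illustrative and deliberately incomplete.

```python
# A few legacy type aliases that imply the exec bit (assumed, not exhaustive).
LEGACY_EXEC_ALIASES = frozenset(["xtext", "xbinary", "kxtext", "cxtext"])


def is_executable(p4type):
    """Return True if the Perforce file type string implies an executable file."""
    base, _, modifiers = p4type.partition("+")
    if "x" in modifiers:
        return True
    return base in LEGACY_EXEC_ALIASES
```

An importer could use the result to decide whether to set Mercurial's `x` file flag on the committed file.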

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7185388

fbshipit-source-id: 59afec7bd857572b8347ebe546d131017a79928c
2018-04-13 21:51:44 -07:00
Alexandre Marin
e2fc55f8a6 Use memctx to create commit without working copy
Summary:
p4seqimport has used very high level mercurial abstractions so far (almost
equivalent to running hg add / mv / rm / commit on command line). This is very
easy to grasp as we use it day to day. It is not performant enough for our
importer:
- It does the work twice (write to working copy, then commit changing hg metadata)
- It requires the working copy (this would force us to update between revs,
  materializing a prohibitively large number of files)

This change makes use of memctx, which is basically an in-memory commit. This way
we don't need a working copy and we save time + a lot of space.

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7176903

fbshipit-source-id: 2773d7c001b615837496ea9db3229d9afc020124
2018-04-13 21:51:44 -07:00
Alexandre Marin
50cae88150 Update bookmark after importing change
Summary:
p4seqimport has a bookmark option, but it was completely ignored before this change.
This makes use of the opt, moving the bookmark as we import changes.

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7172867

fbshipit-source-id: be63765088b0583df2e1c9e0ccec869c5278d782
2018-04-13 21:51:44 -07:00
Alexandre Marin
12cda51997 Correctly treat symlinks
Summary:
Properly create files as symlinks if they are symlinks in P4

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Reviewed By: wlis

Differential Revision: D7157772

fbshipit-source-id: ac3e5010f3d15460592a449c817824c0b28a8435
2018-04-13 21:51:44 -07:00
Alexandre Marin
e92856a0dc Deal with large files and their metadata
Summary:
Similar to #10 (D7113181), we need to track large files.
This change adds the bits to do so, reusing the logic from p4fastimport which was
moved to lfs.py

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7115654

fbshipit-source-id: 56ccfadf6fa14dcfb8005cc5ef03fb175835bcda
2018-04-13 21:51:44 -07:00
Alexandre Marin
ff0f4d75e9 Update revision mapping with CL/hg commit info
Summary:
This change makes seqimport write revision info (i.e. (CL, hghash) pairs) to a
sqlite file. This is used by the importer TW job wrapper to write the info
into `xdb.p4sync` table `revmap`
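
A minimal sketch of recording (CL, hghash) pairs in SQLite, under stated assumptions: the local table name `revmap`, its column names, and the `record_revisions` helper are hypothetical, and an in-memory database stands in for the sqlite file.

```python
import sqlite3


def record_revisions(conn, pairs):
    """Store (changelist number, hg commit hash) pairs in one transaction."""
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS revmap ("
            "cl INTEGER PRIMARY KEY, hghash TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT OR REPLACE INTO revmap (cl, hghash) VALUES (?, ?)", pairs
        )


# In-memory database for illustration; a real run would pass a file path.
conn = sqlite3.connect(":memory:")
record_revisions(conn, [(1001, "a" * 40), (1002, "b" * 40)])
rows = conn.execute("SELECT cl, hghash FROM revmap ORDER BY cl").fetchall()
```

Keying the table on the changelist number makes a re-import of the same CL overwrite its row instead of duplicating it.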

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7113181

fbshipit-source-id: e55a8cf0b794216a4855ae7486885c3d956cd7fb
2018-04-13 21:51:44 -07:00
Alexandre Marin
b646c66e31 Add p4changelist to commit extra metadata
Summary:
Adds p4changelist to commit extra info
With p4changelist info, make p4seqimport incremental
Add debug message to have more accurate info on what is actually being imported

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7090090

fbshipit-source-id: 17529aa57452453cfe29c3c3dc9d9e7daa8cffb2
2018-04-13 21:51:44 -07:00
Alexandre Marin
71b806ac03 Copy tracing
Summary:
Adds copy tracing to `p4seqimport` by:
- Leveraging `fromFile` from `p4 -ztag describe` to introduce source for moved
files into P4Changelist.load's
- Utilizing that info from P4 CL when creating hg commit

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7074892

fbshipit-source-id: e105a608bb953a8137ec6c9afc7e0571a902c868
2018-04-13 21:51:43 -07:00
Alexandre Marin
76a6c3893d Consolidate CL description, user and date manipulations
Summary:
Consolidates manipulation of p4 CL info into p4 module, pulling the relevant code
out of ChangeManifestImporter creategen so it can be easily shared by
p4fastimport and p4seqimport

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7064179

fbshipit-source-id: 72c5bcad209eebf40ec8152a07f98f7f7fa544fb
2018-04-13 21:51:43 -07:00
Alexandre Marin
17023648b6 Create commit
Summary:
Adds logic to create the commit, using info from p4 CL + the list of added and
removed files.

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7063983

fbshipit-source-id: c64e44c19d06e54fe35121a8d6128de050f93823
2018-04-13 21:51:43 -07:00
Alexandre Marin
4ee3972fb0 Get file content from p4, write to hg
Summary:
Read file from perforce, write into the hg repo.

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7050157

fbshipit-source-id: 4389ba11f62c8ed825d6a6ef3c001095339eb551
2018-04-13 21:51:43 -07:00
Alexandre Marin
88030f9c32 ChangelistImporter
Summary:
Creates ChangelistImporter, which will be responsible for translating a p4 CL to
a hg commit

For now it only goes through files touched by the CL and lists what was added or
removed. Next diffs will evolve it to the point where it effectively performs the
translation.

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7049961

fbshipit-source-id: 6a9f3bd57cadc2b9ea8a81373cc10dfda76311e7
2018-04-13 21:51:43 -07:00
Alexandre Marin
06f53fe081 Get changelists to import
Summary:
Pulls the logic to define changelists from p4fastimport into separate function
and re-uses that in p4seqimport

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7035674

fbshipit-source-id: 699e9148d35e437f306062f290c8ec2a857df480
2018-04-13 21:51:43 -07:00
Alexandre Marin
1d63473120 Sanitize opts
Summary:
This change:
Moves some opts sanitizing logic into function `sanitizeopts`
Adds checks for `limit` being a positive integer
Uses `sanitizeopts` new function in p4seqimport
Adds a test covering `sanitizeopts`
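
The `limit` check could look roughly like the sketch below. It is an assumption-laden illustration rather than the actual `sanitizeopts`: the function name `sanitize_limit`, the error type, and the treatment of an empty value as "no limit" are all made up for the example.

```python
def sanitize_limit(opts):
    """Return a copy of `opts` with 'limit' as a positive int or None."""
    cleaned = dict(opts)
    raw = cleaned.get("limit")
    if raw is None or raw == "":
        cleaned["limit"] = None  # no limit requested
        return cleaned
    try:
        limit = int(raw)
    except (TypeError, ValueError):
        raise ValueError("limit must be a positive integer, got %r" % (raw,))
    if limit <= 0:
        raise ValueError("limit must be a positive integer, got %r" % (raw,))
    cleaned["limit"] = limit
    return cleaned
```

Returning a copy leaves the caller's option dictionary untouched, which keeps the helper easy to test in isolation.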

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7035217

fbshipit-source-id: cd677fb254ff83d123673d51a1c682639de08a30
2018-04-13 21:51:43 -07:00
Alexandre Marin
a3df4258d7 Base setup, enforce p4 client exists
Summary:
p4seqimport will be the new command to import from p4 to hg changelist by
changelist. This should provide us with a more robust importer that doesn't rely
on fiddling with hg's data structures directly. p4fastimport was important to
create ovrsource from scratch and import thousands of changelists, but moving
forward it is probably safer and easier to understand/maintain something that is
based on higher level Mercurial APIs

All that said, this is the first change. This change:
1. Creates p4seqimport command as part of the p4fastimport extension
2. Refactors the p4 client checking logic into `enforce_p4_client_exists`
3. Adds a test that checks the new function works through using `p4seqimport`.

For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/

Differential Revision: D7015941

fbshipit-source-id: cb5c59b2f104f336a078025544a44028bf01fa85
2018-04-13 21:51:43 -07:00
Andrew Breckenridge
892836abd7 trivial: Improves documentation for hg fold
Summary: Reimplements D7340879

Reviewed By: kulshrax

Differential Revision: D7473425

fbshipit-source-id: 42df99bd00092be8e517f2af50bf9ae9fc4d5027
2018-04-13 21:51:43 -07:00
Phil Cohen
cb90ed6abe rebase: allow the working copy to be rebased with IMM
Summary:
After testing locally, I couldn't conclusively prove if rebasing a single change with IMM was any faster or slower than on disk.

Using IMM on the working copy will definitely be better for rebasing stacks, and it's just nicer to not have the working copy thrash around as much. It also might be interesting to (possibly) let you work while the rebase is running, too.* So I've added the code that will let us enable this more widely (as a subset of IMM) to experiment.

*I've made it so that if you make any changes during the rebase (causing the last update to fail), we just print a nice message telling you to check out the new rebased working copy commit, instead of failing/aborting. TBD whether this is something we want to encourage people to do, however. I've kept the existing up-front check for uncommitted changes when rebasing the WCP with IMM for now.

Reviewed By: DurhamG

Differential Revision: D7051282

fbshipit-source-id: c04302539021f481c17e47c23d3f4d8b3ed59db6
2018-04-13 21:51:43 -07:00
Jun Wu
f58ee42190 hgsubversion: fix PrefixMatch API
Summary: A matcher needs `visitdir` method defined.

Reviewed By: DurhamG

Differential Revision: D7473780

fbshipit-source-id: a2dc588e80860c44ab3746ec2120429503e16d3b
2018-04-13 21:51:43 -07:00
Durham Goode
9d58066f0c hg: fix pushbackup not converting old flat manifests into trees
Summary:
There was an issue where if the prefetch inside cansendtrees failed, it
wouldn't allow it to actually try the operation. This is undesirable, since
prefetch only talks to the server while the actual tree fetch will also attempt
to generate a tree from an old flat manifest.

Ideally we'd have a more unified flow here, where we could have the server let
us know what nodes it couldn't find, then the client could try other options for
the remaining nodes, but that requires significantly more refactoring.

Reviewed By: quark-zju

Differential Revision: D7450662

fbshipit-source-id: a023f27ee4b74786633e4dce7e62f3d9604c2b7f
2018-04-13 21:51:43 -07:00
Jun Wu
5e828307f4 indexedlog: verify checksum for all reads
Summary:
It further slows down lookups, even when checksum is disabled, since even an
`is_none()` check is not free:

  index insertion                 4.697 ms
  index flush                     3.764 ms
  index lookup (memory)           2.878 ms
  index lookup (disk, no verify)  3.564 ms
  index lookup (disk, verified)   7.788 ms

The "verified" version basically needs 2x time due to more memory lookups.

Unfortunately this means eventual lookup performance will be slower than
gdbm, but insertion is still much faster. And the index still has better
locking properties (lock-free read) that gdbm does not have.

With correct time complexity (no O(len(changelog)) index-only operations for
example), I'd expect it's rare for the overall performance to be bounded by
index performance. Data integrity is more important.

With a larger number of nodes, ex. 2M 20-byte strings: inserting to memory
takes 1.4 seconds, flushing to disk takes 0.9 seconds, looking up without
checksum takes 0.9 seconds, looking up with checksum takes 1.7 seconds.

Reviewed By: DurhamG

Differential Revision: D7440248

fbshipit-source-id: 020e5204606f9f0a4f68843a491009a6a6f75751
2018-04-13 21:51:42 -07:00
Jun Wu
ca8f60eb0a indexedlog: verify checksum for type bytes
Summary:
This is in the critical path for lookup, and has very visible performance
penalty:

  index insertion                 3.923 ms
  index flush                     3.921 ms
  index lookup (memory)           1.070 ms
  index lookup (disk, no verify)  1.980 ms
  index lookup (disk, verified)   5.206 ms

Reviewed By: DurhamG

Differential Revision: D7440252

fbshipit-source-id: 49540f974faff1cdd0603a72328f141ccd054ee2
2018-04-13 21:51:42 -07:00
Jun Wu
55fc90dfea indexedlog: verify checksum for Mem* structs
Summary:
Previously the checksum was only for `MemRoot`; now it's for all `Mem*` structs.
Since `Mem*` structs are not frequently used in the normal lookup code path,
there is no visible performance change.

Reviewed By: DurhamG

Differential Revision: D7440253

fbshipit-source-id: 945f5a8c38d228f59190a487b0cf6dbc5daac4f7
2018-04-13 21:51:42 -07:00
Jun Wu
a7e3e7884d indexedlog: add a type alias for Option<ChecksumTable>
Summary:
The type will be used all over the place and may make `rustfmt` wrap lines.
Use a shorter type to make it slightly cleaner.

Reviewed By: DurhamG

Differential Revision: D7436338

fbshipit-source-id: ecaada23916a22658f65669b748632a077e60df2
2018-04-13 21:51:42 -07:00
Jun Wu
bfd8e33370 indexedlog: verify checksum for root entry
Summary:
This only affects `Index::open` right now. So it's a one-time check and does
not affect performance.

Reviewed By: DurhamG

Differential Revision: D7436341

fbshipit-source-id: 30313064bf2ea50320ac744fc18c03bff4b12c89
2018-04-13 21:51:42 -07:00
Jun Wu
a0cec9853c indexedlog: add checksum table to index struct
Summary:
Add `ChecksumTable` to the `Index` struct. But it's not functional yet.
The checksum will mainly affect "index lookup (disk)" case. Add another
benchmark for showing the difference with checksum on and off. They do not
have much difference right now:

  index insertion                 3.756 ms
  index flush                     3.469 ms
  index lookup (memory)           0.990 ms
  index lookup (disk, no verify)  1.768 ms
  index lookup (disk, verified)   1.766 ms

Reviewed By: DurhamG

Differential Revision: D7436339

fbshipit-source-id: 60a6554a2c96067a53ce9e1753cd51d0d61c0bea
2018-04-13 21:51:42 -07:00
Jun Wu
8d7d4de8ee indexedlog: separate benchmarks
Summary:
The minibench framework does not provide benchmark filtering. So let's
separate benchmarks using different entry points.

Reviewed By: DurhamG

Differential Revision: D7440250

fbshipit-source-id: 11e7790a5074ebf4c08e33c312a490a66a921926
2018-04-13 21:51:42 -07:00
Jun Wu
d86adc417e indexedlog: remove "index clone" benchmarks
Summary:
The "clone" benchmarks were added to be subtracted from "lookup" to
work around the test framework limitation.

The new minibench framework makes it easier to exclude preparation cost.
Therefore the clone benchmarks are no longer needed.

  index insertion                 3.881 ms
  index flush                     3.286 ms
  index lookup (memory)           0.928 ms
  index lookup (disk)             1.685 ms

"index lookup (memory)" is basically "index lookup (memory)" minus
"index clone (memory)" in previous benchmarks.

Reviewed By: DurhamG

Differential Revision: D7440251

fbshipit-source-id: 0e6a1fb7ee64f9a393ee9ada4db6e6eb052e20bf
2018-04-13 21:51:42 -07:00
Jun Wu
9b9dd289e4 indexedlog: use minibench to do benchmark
Summary:
See the previous minibench diff for the motivation.

"failure" was removed from build dependencies since it's not used yet.

Run benchmark a few times. It seems the first several items are less stable,
possibly due to warm-up issues. Otherwise the result looks good enough.
The test also compiles and runs much faster.

```
base16 iterating 1M bytes       0.921 ms
index insertion                 4.804 ms
index flush                     5.104 ms
index lookup (memory)           2.929 ms
index lookup (disk)             1.767 ms
index clone (memory)            2.036 ms
index clone (disk)              0.010 ms

base16 iterating 1M bytes       0.853 ms
index insertion                 4.512 ms
index flush                     4.717 ms
index lookup (memory)           2.907 ms
index lookup (disk)             1.755 ms
index clone (memory)            1.856 ms
index clone (disk)              0.010 ms

base16 iterating 1M bytes       1.525 ms
index insertion                 4.577 ms
index flush                     4.901 ms
index lookup (memory)           2.800 ms
index lookup (disk)             1.790 ms
index clone (memory)            1.794 ms
index clone (disk)              0.010 ms

base16 iterating 1M bytes       0.768 ms
index insertion                 4.486 ms
index flush                     4.918 ms
index lookup (memory)           2.658 ms
index lookup (disk)             1.721 ms
index clone (memory)            1.763 ms
index clone (disk)              0.010 ms

base16 iterating 1M bytes       0.732 ms
index insertion                 4.489 ms
index flush                     4.792 ms
index lookup (memory)           2.689 ms
index lookup (disk)             1.739 ms
index clone (memory)            1.850 ms
index clone (disk)              0.009 ms

base16 iterating 1M bytes       1.124 ms
index insertion                 7.188 ms
index flush                     4.888 ms
index lookup (memory)           2.829 ms
index lookup (disk)             1.609 ms
index clone (memory)            2.642 ms
index clone (disk)              0.010 ms

base16 iterating 1M bytes       1.055 ms
index insertion                 4.683 ms
index flush                     4.996 ms
index lookup (memory)           2.782 ms
index lookup (disk)             1.710 ms
index clone (memory)            1.802 ms
index clone (disk)              0.009 ms
```

Reviewed By: DurhamG

Differential Revision: D7440249

fbshipit-source-id: 0f946ab184455acd40c5a38cf46ff94d9e3755c8
2018-04-13 21:51:42 -07:00
Jun Wu
f9fb60337a minibench: add a simple library to do benchmark
Summary:
It's sad to find that existing Rust benchmark frameworks do not fit well in
our simple benchmark purpose. The benchmark library shipped with Rust [1] has
been "nightly-only" for a long time. Third-party choices like "criterion.rs" do
too many things and miss certain small features. Namely, indexedlog wants:

  - More stable benchmark result. This means not picking the average time,
    but the "best" time among all runs, like what Mercurial does.
  - Do not measure setup cost from repetitive runs. As in D7404532, do not
    clone the index, and do not have separate "clone" benchmarks.
  - Faster benchmarks. This means getting rid of unused parts like calling
    gnuplot.
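
The first two requirements in the list above (report the best time rather than the average, and keep setup cost out of the measurement) can be illustrated with a short sketch. It is written in Python purely for illustration and is not minibench's API; `bench_best`, `make_data`, and the workload are hypothetical.

```python
import time


def bench_best(setup, run, repeat=5):
    """Return the best (smallest) wall-clock time of `run` over `repeat` runs.

    `setup` is called before every run and is excluded from the timing.
    """
    best = None
    for _ in range(repeat):
        state = setup()  # preparation cost, not measured
        start = time.perf_counter()
        run(state)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


def make_data():
    # Hypothetical workload input: a reversed list of integers.
    return list(range(100000, 0, -1))


best = bench_best(make_data, sorted, repeat=3)
print("sort 100k ints  %.3f ms" % (best * 1000))
```

Taking the minimum filters out runs disturbed by scheduling noise or cold caches, which is why it tends to be more stable than the mean for micro-benchmarks.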

Besides, keeping the test framework lightweight also helps compilation
time. Looking at `indexedlog`'s dependencies (with unused "failure"
removed), 70% of them are from `criterion.rs`.

```
indexedlog v0.1.0 (lib/indexedlog)
[dependencies]
|-- atomicwrites v0.1.5
|   [dependencies]
|   |-- nix v0.9.0
|   |   [dependencies]
|   |   |-- bitflags v0.9.1
|   |   |-- cfg-if v0.1.2
|   |   |-- libc v0.2.39
|   |   `-- void v1.0.2
|   `-- tempdir v0.3.6
|       [dependencies]
|       |-- rand v0.4.2
|       |   [dependencies]
|       |   `-- libc v0.2.39 (*)
|       `-- remove_dir_all v0.3.0
|           [dependencies]
|           |-- kernel32-sys v0.2.2
|           |   [dependencies]
|           |   `-- winapi v0.2.8
|           |   [build-dependencies]
|           |   `-- winapi-build v0.1.1
|           `-- winapi v0.2.8 (*)
|-- byteorder v1.2.1
|-- fs2 v0.4.3
|   [dependencies]
|   `-- libc v0.2.39 (*)
|-- memmap v0.6.2
|   [dependencies]
|   `-- libc v0.2.39 (*)
|-- twox-hash v1.1.0
|   [dependencies]
|   `-- rand v0.3.22
|       [dependencies]
|       |-- libc v0.2.39 (*)
|       `-- rand v0.4.2 (*)
`-- vlqencoding v0.1.0 (lib/vlqencoding)
[dev-dependencies]
|-- criterion v0.2.1
|   [dependencies]
|   |-- atty v0.2.8
|   |   [dependencies]
|   |   `-- libc v0.2.39 (*)
|   |-- clap v2.31.1
|   |   [dependencies]
|   |   |-- ansi_term v0.11.0
|   |   |-- atty v0.2.8 (*)
|   |   |-- bitflags v1.0.1
|   |   |-- strsim v0.7.0
|   |   |-- textwrap v0.9.0
|   |   |   [dependencies]
|   |   |   `-- unicode-width v0.1.4
|   |   |-- unicode-width v0.1.4 (*)
|   |   `-- vec_map v0.8.0
|   |-- criterion-plot v0.2.1
|   |   [dependencies]
|   |   |-- byteorder v1.2.1 (*)
|   |   |-- cast v0.2.2
|   |   `-- itertools v0.7.7
|   |       [dependencies]
|   |       `-- either v1.4.0
|   |-- criterion-stats v0.2.1
|   |   [dependencies]
|   |   |-- cast v0.2.2 (*)
|   |   |-- num-traits v0.2.1
|   |   |-- num_cpus v1.8.0
|   |   |   [dependencies]
|   |   |   `-- libc v0.2.39 (*)
|   |   |-- rand v0.4.2 (*)
|   |   `-- thread-scoped v1.0.2
|   |-- failure v0.1.1
|   |   [dependencies]
|   |   |-- backtrace v0.3.5
|   |   |   [dependencies]
|   |   |   |-- backtrace-sys v0.1.16
|   |   |   |   [dependencies]
|   |   |   |   `-- libc v0.2.39 (*)
|   |   |   |   [build-dependencies]
|   |   |   |   `-- cc v1.0.8
|   |   |   |-- cfg-if v0.1.2 (*)
|   |   |   |-- libc v0.2.39 (*)
|   |   |   `-- rustc-demangle v0.1.7
|   |   `-- failure_derive v0.1.1
|   |       [dependencies]
|   |       |-- quote v0.3.15
|   |       |-- syn v0.11.11
|   |       |   [dependencies]
|   |       |   |-- quote v0.3.15 (*)
|   |       |   |-- synom v0.11.3
|   |       |   |   [dependencies]
|   |       |   |   `-- unicode-xid v0.0.4
|   |       |   `-- unicode-xid v0.0.4 (*)
|   |       `-- synstructure v0.6.1
|   |           [dependencies]
|   |           |-- quote v0.3.15 (*)
|   |           `-- syn v0.11.11 (*)
|   |-- failure_derive v0.1.1 (*)
|   |-- handlebars v0.31.0
|   |   [dependencies]
|   |   |-- lazy_static v1.0.0
|   |   |-- log v0.4.1
|   |   |   [dependencies]
|   |   |   `-- cfg-if v0.1.2 (*)
|   |   |-- pest v1.0.6
|   |   |-- pest_derive v1.0.6
|   |   |   [dependencies]
|   |   |   |-- pest v1.0.6 (*)
|   |   |   |-- quote v0.3.15 (*)
|   |   |   `-- syn v0.11.11 (*)
|   |   |-- quick-error v1.2.1
|   |   |-- regex v0.2.10
|   |   |   [dependencies]
|   |   |   |-- aho-corasick v0.6.4
|   |   |   |   [dependencies]
|   |   |   |   `-- memchr v2.0.1
|   |   |   |       [dependencies]
|   |   |   |       `-- libc v0.2.39 (*)
|   |   |   |-- memchr v2.0.1 (*)
|   |   |   |-- regex-syntax v0.5.3
|   |   |   |   [dependencies]
|   |   |   |   `-- ucd-util v0.1.1
|   |   |   |-- thread_local v0.3.5
|   |   |   |   [dependencies]
|   |   |   |   |-- lazy_static v1.0.0 (*)
|   |   |   |   `-- unreachable v1.0.0
|   |   |   |       [dependencies]
|   |   |   |       `-- void v1.0.2 (*)
|   |   |   `-- utf8-ranges v1.0.0
|   |   |-- serde v1.0.33
|   |   `-- serde_json v1.0.11
|   |       [dependencies]
|   |       |-- dtoa v0.4.2
|   |       |-- itoa v0.3.4
|   |       |-- num-traits v0.2.1 (*)
|   |       `-- serde v1.0.33 (*)
|   |-- itertools v0.7.7 (*)
|   |-- itertools-num v0.1.1
|   |   [dependencies]
|   |   `-- num-traits v0.1.43
|   |       [dependencies]
|   |       `-- num-traits v0.2.1 (*)
|   |-- log v0.4.1 (*)
|   |-- serde v1.0.33 (*)
|   |-- serde_derive v1.0.33
|   |   [dependencies]
|   |   |-- proc-macro2 v0.2.3
|   |   |   [dependencies]
|   |   |   `-- unicode-xid v0.1.0
|   |   |-- quote v0.4.2
|   |   |   [dependencies]
|   |   |   `-- proc-macro2 v0.2.3 (*)
|   |   |-- serde_derive_internals v0.21.0
|   |   |   [dependencies]
|   |   |   |-- proc-macro2 v0.2.3 (*)
|   |   |   `-- syn v0.12.14
|   |   |       [dependencies]
|   |   |       |-- proc-macro2 v0.2.3 (*)
|   |   |       |-- quote v0.4.2 (*)
|   |   |       `-- unicode-xid v0.1.0 (*)
|   |   `-- syn v0.12.14 (*)
|   |-- serde_json v1.0.11 (*)
|   `-- simplelog v0.5.0
|       [dependencies]
|       |-- chrono v0.4.0
|       |   [dependencies]
|       |   |-- num v0.1.42
|       |   |   [dependencies]
|       |   |   |-- num-integer v0.1.36
|       |   |   |   [dependencies]
|       |   |   |   `-- num-traits v0.2.1 (*)
|       |   |   |-- num-iter v0.1.35
|       |   |   |   [dependencies]
|       |   |   |   |-- num-integer v0.1.36 (*)
|       |   |   |   `-- num-traits v0.2.1 (*)
|       |   |   `-- num-traits v0.2.1 (*)
|       |   `-- time v0.1.39
|       |       [dependencies]
|       |       `-- libc v0.2.39 (*)
|       |       [dev-dependencies]
|       |       `-- winapi v0.3.4
|       |-- log v0.4.1 (*)
|       `-- term v0.4.6
|-- quickcheck v0.6.2
|   [dependencies]
|   |-- env_logger v0.5.6
|   |   [dependencies]
|   |   |-- atty v0.2.8 (*)
|   |   |-- humantime v1.1.1
|   |   |   [dependencies]
|   |   |   `-- quick-error v1.2.1 (*)
|   |   |-- log v0.4.1 (*)
|   |   |-- regex v0.2.10 (*)
|   |   `-- termcolor v0.3.5
|   |-- log v0.4.1 (*)
|   `-- rand v0.4.2 (*)
|-- rand v0.4.2 (*)
`-- tempdir v0.3.6 (*)
```

[1]: https://github.com/rust-lang/rust/issues/29553

Reviewed By: DurhamG

Differential Revision: D7440254

fbshipit-source-id: 53cdbd470945388db96702ab771a3f73b456da37
2018-04-13 21:51:42 -07:00
Jun Wu
8bcff92cab indexedlog: use a dedicated map type for offset translation
Summary:
The dirty -> non-dirty offset mapping can be optimized using a dedicated
"map" type that is backed by `vec`s, because dirty offsets are continuous
per type.
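
The idea of a list-backed map for contiguous per-type offsets can be sketched as follows. This is a Python illustration of the concept, not the Rust implementation: `DirtyOffsetMap`, the entry kinds `"radix"` and `"leaf"`, and the offsets are all hypothetical.

```python
class DirtyOffsetMap:
    """Map (kind, dirty index) to an on-disk offset with one list per kind.

    Because dirty indexes are contiguous per kind, a plain list lookup can
    replace a general-purpose hash map.
    """

    def __init__(self, counts):
        # counts maps each entry kind to its number of dirty entries.
        self._offsets = {kind: [None] * n for kind, n in counts.items()}

    def set(self, kind, index, disk_offset):
        self._offsets[kind][index] = disk_offset

    def get(self, kind, index):
        offset = self._offsets[kind][index]
        if offset is None:
            raise KeyError((kind, index))
        return offset


offsets = DirtyOffsetMap({"radix": 2, "leaf": 1})
offsets.set("radix", 0, 4096)
offsets.set("radix", 1, 4160)
offsets.set("leaf", 0, 4200)
```

Each lookup is a direct index into a preallocated list, so no hashing or rehashing happens while entries are being translated.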

This makes "flush" significantly faster:

```
index flush             time:   [5.8808 ms 6.1800 ms 6.4813 ms]
                        change: [-62.250% -59.481% -56.325%] (p = 0.00 < 0.05)
                        Performance has improved.
```

Reviewed By: DurhamG

Differential Revision: D7422832

fbshipit-source-id: 9ab8a70d1663155941dae5b4f02f7452f5e3cadf
2018-04-13 21:51:42 -07:00
Jun Wu
00503a6d94 indexedlog: avoid a memory allocation
Summary:
It seems to improve the performance a bit:

```
index insertion         time:   [5.4643 ms 5.6818 ms 5.9188 ms]
                        change: [-24.526% -17.384% -10.315%] (p = 0.00 < 0.05)
                        Performance has improved.
```

Reviewed By: DurhamG

Differential Revision: D7422831

fbshipit-source-id: fc1c72f402258db7e189cd8724583757d48affb7
2018-04-13 21:51:42 -07:00
Jun Wu
4cb2cc1abb indexedlog: use Box<[u8]> instead of Vec<u8>
Summary:
For key entries, the key is immutable once stored. So just use `Box<[u8]>`.
It saves a `usize` per entry. On a 64-bit platform, that's a lot.

Performance is slightly improved and it catches up with D7404532 before
typed offset refactoring now:

  index insertion         time:   [6.1852 ms 6.6598 ms 7.2433 ms]
  index flush             time:   [15.814 ms 16.538 ms 17.235 ms]
  index lookup (memory)   time:   [3.7636 ms 3.9403 ms 4.1424 ms]
  index lookup (disk)     time:   [1.9413 ms 2.0366 ms 2.1325 ms]
  index clone (memory)    time:   [2.6952 ms 2.9221 ms 3.0968 ms]
  index clone (disk)      time:   [5.0296 us 5.2862 us 5.5629 us]

Reviewed By: DurhamG

Differential Revision: D7422837

fbshipit-source-id: 4aabfdc028aefb8e796803e103f0b2e4965f84e6
2018-04-13 21:51:42 -07:00
Jun Wu
36793b7c14 indexedlog: simplify insert_advanced API
Summary:
Previously, both `value` and `link` are optional in `insert_advanced`.
This diff makes `value` required.

`maybe_create_link_entry` is now unused and has been removed.

No visible performance change.

Reviewed By: DurhamG

Differential Revision: D7422838

fbshipit-source-id: 8d7d3cc1cc325f6fea7e8ce996d0a43d3ee49839
2018-04-13 21:51:41 -07:00
Phil Cohen
8f2a8be437 perfsuite: add --print and --use-profile
Summary:
Also add an IMM test to tease out working-copy vs. non-working-copy issues.

Also add some newlines to code stolen from fbcode.

Reviewed By: DurhamG

Differential Revision: D7432333

fbshipit-source-id: 029ccd8aeec7f0e2c380da41e7d78b433a275af3
2018-04-13 21:51:41 -07:00
Jun Wu
892fcd6dfd indexedlog: use typed offsets
Summary:
This is a large refactoring that replaces `u64` offsets with strongly typed
ones.

Tests about serialization are removed since they generate illegal data that
cannot pass type check.

It seems to slow down the code a bit compared with D7404532, but there is
still room to improve.

  index insertion         time:   [6.9395 ms 7.3863 ms 7.7620 ms]
  index flush             time:   [15.949 ms 17.965 ms 20.246 ms]
  index lookup (memory)   time:   [3.6212 ms 3.8855 ms 4.1923 ms]
  index lookup (disk)     time:   [2.2496 ms 2.4649 ms 2.8090 ms]
  index clone (memory)    time:   [2.7292 ms 2.9399 ms 3.2055 ms]
  index clone (disk)      time:   [4.9239 us 5.5928 us 6.3167 us]

Reviewed By: DurhamG

Differential Revision: D7422833

fbshipit-source-id: 7357cb0f4f573f620e829c5e300cd423619dbd62
2018-04-13 21:51:41 -07:00
Jun Wu
b5cd2be169 fsmonitor: ignore errors when calculating update distance
Summary:
I got errors running `histedit --abort` in the code path:

  quark [1] % hg histedit --abort
  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
  saved backup bundle to .hg/strip-backup/098a5bf950b2-78da64d6-backup.hg
  saved backup bundle to .hg/strip-backup/accd7866dec2-599f621b-backup.hg
  Traceback (most recent call last):
    File "/usr/lib/python2.7/site-packages/mercurial/scmutil.py", line 159, in callcatch
      return func()
    File "/usr/lib/python2.7/site-packages/mercurial/dispatch.py", line 340, in _runcatchfunc
      return _dispatch(req)
    File "/usr/lib/python2.7/site-packages/mercurial/dispatch.py", line 944, in _dispatch
      cmdpats, cmdoptions)
    File "/usr/lib/python2.7/site-packages/hgext/remotefilelog/__init__.py", line 458, in runcommand
      return orig(lui, repo, *args, **kwargs)
    File "/usr/lib/python2.7/site-packages/hgext/journal.py", line 85, in runcommand
      return orig(lui, repo, cmd, fullargs, *args)
    File "/usr/lib/python2.7/site-packages/hgext/undo.py", line 118, in _runcommandwrapper
      result = orig(lui, repo, cmd, fullargs, *args)
    File "/usr/lib/python2.7/site-packages/hgext/fastmanifest/__init__.py", line 202, in _logonexit
      r = orig(ui, repo, cmd, fullargs, *args)
    File "/usr/lib/python2.7/site-packages/hgext/perftweaks.py", line 326, in _tracksparseprofiles
      res = runcommand(lui, repo, *args)
    File "/usr/lib/python2.7/site-packages/hgext/perftweaks.py", line 313, in _trackdirstatesizes
      res = runcommand(lui, repo, *args)
    File "/usr/lib/python2.7/site-packages/hgext/fbamend/hiddenoverride.py", line 92, in runcommand
      result = orig(lui, repo, cmd, fullargs, *args)
    File "/usr/lib/python2.7/site-packages/mercurial/dispatch.py", line 693, in runcommand
      ret = _runcommand(ui, options, cmd, d)
    File "/usr/lib/python2.7/site-packages/mercurial/dispatch.py", line 952, in _runcommand
      return cmdfunc()
    File "/usr/lib/python2.7/site-packages/mercurial/dispatch.py", line 941, in <lambda>
      d = lambda: util.checksignature(func)(ui, *args, **strcmdopt)
    File "/usr/lib/python2.7/site-packages/hgext/remotenames.py", line 698, in exhistedit
      ret = orig(ui, repo, *args, **opts)
    File "/usr/lib/python2.7/site-packages/hgext/histedit.py", line 1033, in histedit
      release(state.lock, state.wlock)
    File "/usr/lib/python2.7/site-packages/mercurial/lock.py", line 329, in release
      lock.release()
    File "/usr/lib/python2.7/site-packages/mercurial/lock.py", line 311, in release
      self.releasefn()
    File "/usr/lib/python2.7/site-packages/hgext/fsmonitor/__init__.py", line 878, in staterelease
      l.stateupdate.exit()
    File "/usr/lib/python2.7/site-packages/hgext/fsmonitor/__init__.py", line 751, in exit
      self.repo, self.oldnode, self.newnode)
    File "/usr/lib/python2.7/site-packages/hgext/fsmonitor/__init__.py", line 787, in calcdistance
      anc = repo.changelog.ancestor(oldnode, newnode)
    File "/usr/lib/python2.7/site-packages/mercurial/revlog.py", line 1139, in ancestor
      a, b = self.rev(a), self.rev(b)
    File "/usr/lib/python2.7/site-packages/mercurial/changelog.py", line 360, in rev
      r = super(changelog, self).rev(node)
    File "/usr/lib/python2.7/site-packages/mercurial/revlog.py", line 559, in rev
      raise LookupError(node, self.indexfile, _('no node'))
  LookupError: 00changelog.i@67b8abf62104: no node
  abort: 00changelog.i@67b8abf62104: no node!

If a real strip happens while we are in a nested "update state" situation, the
nodes might be gone:

  enter lock state with src: node1, dest: node2
    real strip node1 or node2 <- this is possible
  exit lock state - node2 is missing
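
The failure mode above can be reproduced with a toy model. This is a minimal sketch, not the actual fsmonitor change: `FakeChangelog` is a hypothetical stand-in for `repo.changelog`, the built-in `LookupError` stands in for Mercurial's own `LookupError`, and returning `None` for an unknown distance is an assumption about how the error might be ignored.

```python
class FakeChangelog:
    """Hypothetical stand-in for repo.changelog: node name -> revision number."""

    def __init__(self, revs):
        self._revs = dict(revs)

    def rev(self, node):
        try:
            return self._revs[node]
        except KeyError:
            # Mirrors the "no node" failure seen in the traceback.
            raise LookupError("%s: no node" % node)

    def ancestor(self, a, b):
        # Toy linear history: the common ancestor is the lower revision.
        reva, revb = self.rev(a), self.rev(b)
        return a if reva <= revb else b

    def strip(self, node):
        del self._revs[node]


def calcdistance(changelog, oldnode, newnode):
    """Return the update distance, or None if a node has been stripped."""
    try:
        ancrev = changelog.rev(changelog.ancestor(oldnode, newnode))
        return (abs(changelog.rev(oldnode) - ancrev)
                + abs(changelog.rev(newnode) - ancrev))
    except LookupError:
        # The distance is only advisory, so a missing node is not fatal.
        return None


changelog = FakeChangelog({"node1": 3, "node2": 7})
before = calcdistance(changelog, "node1", "node2")
changelog.strip("node2")  # the "real strip" inside the lock state
after = calcdistance(changelog, "node1", "node2")
```

Here `before` is computed normally, while `after` is `None` instead of an abort when the lock is released.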

Reviewed By: wez

Differential Revision: D7462306

fbshipit-source-id: 133252583519bc6916a00df4a4a82b36591fb8a5
2018-04-13 21:51:41 -07:00
Phil Cohen
9262397c0b add progress.spinner around fsmonitor walk
Summary:
When the fsmonitor state gets invalidated and a full repo walk happens (e.g. after `rebase --abort`), it's a very frustrating experience because Mercurial appears to hang without any output until the walk is done (1-2min for my www clone).

As we did for "querying watchman", let's add a spinner here so people know what's going on, other than "Mercurial is slow". It also will make it easier to claim impact to whoever helps fix our fsmonitor invalidation story[1], since they can tell users "I fixed the 'full repo walk' issue".

Reviewed By: DurhamG, quark-zju

Differential Revision: D7448662

fbshipit-source-id: 011eb742bf388328b3cceb2762681b4f2e9a4eb1
2018-04-13 21:51:41 -07:00
Durham Goode
5adb6167db hg: handle server treemanifest lookup errors more gracefully
Summary:
Previously we would throw an unhandled exception which would cause
stderr output on the client. Since we sometimes want to try the server silently
from the client, let's return errors as a bundle2 part that we can handle more
gracefully.

Reviewed By: quark-zju

Differential Revision: D7448101

fbshipit-source-id: f0c5d0af56e718f0403ed9a18d66ad8be150d5b8
2018-04-13 21:51:41 -07:00
Durham Goode
c51ac43fe6 hg: switch metadatastore to MissingNodesError
Summary:
Data stores have already migrated to MissingNodesError instead of
KeyError, so let's move metadatastore as well. This provides better error
messages and more specific catching.
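
As an illustration of why a dedicated exception helps, here is a minimal sketch. The class layout, the `(name, node)` key shape, and the `ToyMetadataStore` class are assumptions made for the example, not the code in this diff.

```python
class MissingNodesError(LookupError):
    """Illustrative exception that records which (name, node) keys are missing."""

    def __init__(self, keys, message=None):
        self.keys = list(keys)
        summary = message or "missing nodes"
        lines = ["%s: %s" % (name, node) for name, node in self.keys]
        super().__init__("%s\n%s" % (summary, "\n".join(lines)))


class ToyMetadataStore:
    """Hypothetical store keyed by (name, node)."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def getancestors(self, name, node):
        try:
            return self._entries[(name, node)]
        except KeyError:
            # Raise the specific error so callers need not catch every KeyError.
            raise MissingNodesError([(name, node)]) from None


store = ToyMetadataStore({("foo.py", "abc123"): ["p1node"]})
try:
    store.getancestors("bar.py", "def456")
    caught = None
except MissingNodesError as exc:
    caught = exc
```

A caller that catches `MissingNodesError` can report exactly which keys were absent, whereas a bare `KeyError` could also come from an unrelated dictionary lookup.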

Reviewed By: phillco

Differential Revision: D7448103

fbshipit-source-id: 33d0f267545abd7d4063d2b344a93d26aff76d81
2018-04-13 21:51:41 -07:00
Durham Goode
9c3ff85ad8 hg: move bundle2 error part creation to a helper
Summary:
Since error parts pass the message and hint as parameters instead of
payload, they are limited to 255 characters. Let's add a helper function to
enforce this.
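
A helper of this kind might look like the sketch below. The function name, the returned dict, and the choice to truncate with a trailing `...` are assumptions for illustration; only the 255-character limit comes from the summary above.

```python
MAX_PARAM_LEN = 255  # limit on a bundle2 part parameter value, per the summary


def error_part_params(message, hint=None, limit=MAX_PARAM_LEN):
    """Build error-part parameters, clipping values that exceed the limit.

    Hypothetical helper: values are bytes, and anything too long is cut
    and marked with a trailing b"...".
    """

    def clip(value):
        if len(value) <= limit:
            return value
        return value[:limit - 3] + b"..."

    params = {"message": clip(message)}
    if hint is not None:
        params["hint"] = clip(hint)
    return params


params = error_part_params(b"x" * 1000, hint=b"try again later")
```

Centralizing the clipping in one place means each call site does not have to remember the limit.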

Reviewed By: quark-zju

Differential Revision: D7448104

fbshipit-source-id: 33d47a21e7159b6c4bd72cad9669568b92a51e34
2018-04-13 21:51:41 -07:00
Phil Cohen
add6dcc39c overlayworkingfilectx: properly rebase flag-only changes
Summary:
In an in-memory merge, if a commit only changed the flags of a file, and that file also never got written to during the merge, the IMM could fail and cause it to restart.

The reason is pretty simple: `setflags()` sets `cache[flags]` but not `cache[data]`, as it doesn't have any new data to store. In that case, calls to read the data should fall through to the underlying `p1` context.

Indeed, proper logic to do that already exists in `overlayworkingctx.data(path)` and `flags(path)`. The problem is that `tomemctx()` was reading from the cache directly, which is problematic and unhygienic. So let's just change it to call the proper functions, which also fixes the bug.
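
The fall-through behavior can be shown with a toy overlay. This is a simplified sketch, not Mercurial's `overlayworkingctx`: `BaseCtx`, `OverlayCtx`, and `snapshot()` (standing in for what `tomemctx()` collects) are hypothetical names.

```python
class BaseCtx:
    """Hypothetical p1 context: path -> (data, flags)."""

    def __init__(self, files):
        self._files = dict(files)

    def data(self, path):
        return self._files[path][0]

    def flags(self, path):
        return self._files[path][1]


class OverlayCtx:
    """Toy overlay that caches writes and falls through to p1 otherwise."""

    def __init__(self, p1):
        self._p1 = p1
        self._cache = {}

    def write(self, path, data):
        self._cache.setdefault(path, {})["data"] = data

    def setflags(self, path, flags):
        # Only flags are cached; there is no new data to store.
        self._cache.setdefault(path, {})["flags"] = flags

    def data(self, path):
        entry = self._cache.get(path, {})
        return entry["data"] if "data" in entry else self._p1.data(path)

    def flags(self, path):
        entry = self._cache.get(path, {})
        return entry["flags"] if "flags" in entry else self._p1.flags(path)

    def snapshot(self):
        # Go through the accessors rather than reading self._cache directly,
        # so a flag-only change still picks up its data from p1.
        return {p: (self.data(p), self.flags(p)) for p in self._cache}


overlay = OverlayCtx(BaseCtx({"run.sh": (b"echo hi\n", "")}))
overlay.setflags("run.sh", "x")  # flag-only change, no write()
result = overlay.snapshot()
```

Reading `self._cache["run.sh"]["data"]` directly would raise `KeyError` here, which is the kind of failure the accessor-based version avoids.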

Reviewed By: DurhamG

Differential Revision: D7447640

fbshipit-source-id: 1625ef82ad2683c6a72059a0944fd5e336d3ec3a
2018-04-13 21:51:41 -07:00