Summary:
[commitcloud] commit cloud recover state
Add the `hg cloudrecover` command.
It might be helpful to have a command like this in case something goes wrong
with the local state.
Reviewed By: DurhamG
Differential Revision: D7417147
fbshipit-source-id: 4b236f2753b1f212ff4881a649032e53e032c66c
Summary:
When pushing a backup bundle to the server, check if the response contains an
error, and fail the backup accordingly.
Differential Revision: D7498324
fbshipit-source-id: a08807ac54e9d3044ff1450e93d2a8ea9d6f767f
Summary:
Add a server-side config option `infinitepush.maxbundlesize` to control the
maximum bundle size (currently 100MB).
Add a test that shows bad behaviour when pushing backups that exceed this size.
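A minimal sketch of the size check such an option implies. Only the option name `infinitepush.maxbundlesize` and the 100MB default come from the summary above; the `config` dictionary, the `BundleTooLarge` exception, the `check_bundle_size` helper, and the reading of 100MB as 100 * 1024 * 1024 bytes are assumptions made for illustration, not the extension's actual code.

```python
# Hypothetical sketch: every name except infinitepush.maxbundlesize is invented.
DEFAULT_MAX_BUNDLE_SIZE = 100 * 1024 * 1024  # assumes "100MB" means 100 MiB


class BundleTooLarge(Exception):
    """Raised when a pushed bundle exceeds the configured limit."""


def check_bundle_size(bundle, config):
    """Return the bundle size, or raise if it exceeds the configured limit."""
    limit = config.get("infinitepush.maxbundlesize", DEFAULT_MAX_BUNDLE_SIZE)
    size = len(bundle)
    if size > limit:
        raise BundleTooLarge(
            "bundle is %d bytes, limit is %d bytes" % (size, limit)
        )
    return size
```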
Differential Revision: D7498323
fbshipit-source-id: 640478e7a58cb3c39408fe2a24d8d581f14d891c
Summary: Since we now have the ability to store multiple values, add a test.
Reviewed By: DurhamG
Differential Revision: D7472880
fbshipit-source-id: 85b1c69245ac7f0c4702daf22a02f5e5072f0924
Summary:
The value type is a linked list of u64 integers. Add an API to expose that.
Using the iterator framework adds flexibility: the caller can easily take the
first value, convert the values to a vector, count them, etc.
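The idea can be sketched in Python with a generator, as a rough analogue of the Rust iterator. The `links` dictionary (mapping a link offset to a `(value, next_offset)` pair), the use of offset 0 as the end marker, and the function name `iter_values` are assumptions for illustration and do not mirror the real on-disk format.

```python
def iter_values(links, offset):
    """Lazily yield the values of a linked list, starting at `offset`.

    `links` maps a link offset to (value, next_offset); 0 ends the list.
    Both conventions are assumptions of this sketch.
    """
    while offset != 0:
        value, offset = links[offset]
        yield value


# Three values inserted in the order 10, 20, 30; the newest link comes first.
links = {1: (10, 0), 2: (20, 1), 3: (30, 2)}

first = next(iter_values(links, 3))            # take only the first value
as_list = list(iter_values(links, 3))          # collect everything
count = sum(1 for _ in iter_values(links, 3))  # count without collecting
```

Because the generator is lazy, taking only the first value does not walk the rest of the list.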
Reviewed By: DurhamG
Differential Revision: D7472881
fbshipit-source-id: d31e81770e069734b54fa08729c0cd45a699aae2
Summary:
This is caught by a later test. Looking up a nonexistent child (jumptable
value is 0) returned an InvalidData error, while it should return Offset(0).
The added `if` condition does not seem to have a noticeable performance impact:
index insertion 3.840 ms
index flush 3.740 ms
index lookup (memory) 1.085 ms
index lookup (disk, no verify) 1.972 ms
index lookup (disk, verified) 7.752 ms
Reviewed By: DurhamG
Differential Revision: D7472882
fbshipit-source-id: 1cc51e9afa248e123cca9c561d7bb2128fd898b1
Summary:
Previously, the code focused on getting the hardest part (the index) right,
and paid less attention to the value part. There is not yet a way to get all
values in the linked list, as designed. This diff starts that work.
Similar to `KeyOffset::key_and_link_offset`, change the internal API of
LinkOffset to return both value and the next link offset.
Reviewed By: DurhamG
Differential Revision: D7472879
fbshipit-source-id: 4a4512d7c63abbb667146de582e0f8cd04c9c04a
Summary:
`Index::open` now takes too many parameters, which is not very convenient to
use. Inspired by `fs::OpenOptions`, use a dedicated struct for specifying
open options.
Motivation: To test checksum ability more confidently, I'd like to write
something that randomly mutates 1 byte from a sane index. To make sure the
checksum coverage is "correct", checksum chunk size is another parameter.
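The options-struct pattern can be illustrated with a small Python builder. This is only a sketch of the pattern: the method names `set_checksum_chunk_size` and `set_write`, the defaults, and the dictionary returned by `open` are hypothetical and are not the Rust API.

```python
class OpenOptions(object):
    """Hypothetical options builder; field names are illustrative only."""

    def __init__(self):
        self.checksum_chunk_size = 0  # 0 stands for "no checksum" in this sketch
        self.write = True

    def set_checksum_chunk_size(self, size):
        self.checksum_chunk_size = size
        return self  # returning self allows chained calls

    def set_write(self, flag):
        self.write = flag
        return self

    def open(self, path):
        # A real implementation would open the index file here. The sketch
        # only returns the collected settings.
        return {
            "path": path,
            "checksum_chunk_size": self.checksum_chunk_size,
            "write": self.write,
        }


opened = OpenOptions().set_checksum_chunk_size(64).set_write(False).open("index")
```

New parameters can then be added as new setters without changing the signature of `open`.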
Reviewed By: DurhamG
Differential Revision: D7464182
fbshipit-source-id: 469ce7d1cfa5de3946028418567a9f3e2bc303fa
Summary:
Address DurhamG's review comment on D7422832.
Previously, `OffsetMap::get` expected a dirty offset. That's because it was
changed from `HashMap` and we don't control `HashMap::get`. It's cleaner to
let `OffsetMap` do the `is_dirty` check.
Reviewed By: DurhamG
Differential Revision: D7461707
fbshipit-source-id: 9f2abdf6c6f993d98d9443f16bafcc6154ee0dbb
Summary:
The new test covers the `else` branch inside `LeafOffset::set_link`, which was
previously not covered.
Coverage was checked by the following script:
```
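from __future__ import absolute_import
import glob
import os
import shutil
# Build the test binary with flags that keep dead code visible to kcov.
os.system('cargo rustc --lib --profile test -- -Ccodegen-units=1 -Clink-dead-code -Zno-landing-pads')
# Pick the most recently built test binary.
path = max(glob.glob('./target/debug/*-????????????????'), key=lambda p: os.stat(p).st_mtime)
# Remove stale coverage output; do not fail if the directory does not exist yet.
shutil.rmtree('target/kcov', ignore_errors=True)
os.system('kcov --include-path $PWD/src --verify target/kcov %s' % path)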
```
Reviewed By: DurhamG
Differential Revision: D7446902
fbshipit-source-id: 293da2ff53b83c8f11534f0f8e5e7fd102216a01
Summary:
Change `insert_advanced` to accept an enum that could be either a key, or an
(offset, len) that refers to the external key buffer.
Insertion becomes slower due to new flexibility overhead. For some reason,
"index lookup (no verify)" becomes faster (restores pre-D7440248 performance):
index insertion 6.434 ms
index flush 3.757 ms
index lookup (memory) 1.068 ms
index lookup (disk, no verify) 1.969 ms
index lookup (disk, verified) 7.805 ms
With 2M 20-byte keys, the non-external key version generates a 105MB index:
seconds operation
1.247 insert
0.622 flush
1.859 flush done
0.702 lookup (without checksum)
1.395 lookup (with checksum)
Using external keys, the index is 70MB, and the time for each operation is:
seconds operation
1.086 insert
0.702 flush
0.665 lookup (without checksums)
1.602 lookup (with checksums)
External keys will save more space for longer keys, ex. file paths.
The `Index` module was made public so that the `InsertKey` type is usable.
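The choice between an embedded key and an `(offset, len)` reference into the external key buffer can be sketched in Python as follows. The tuple encoding and the helper names `embedded`, `reference`, and `resolve_key` are assumptions for illustration; the real `InsertKey` is a Rust enum.

```python
def embedded(key):
    """Hypothetical constructor: the key bytes are stored in the index."""
    return ("embed", key)


def reference(offset, length):
    """Hypothetical constructor: the key lives in the external key buffer."""
    return ("reference", offset, length)


def resolve_key(insert_key, key_buffer):
    """Return the key bytes for either variant."""
    if insert_key[0] == "embed":
        return insert_key[1]
    _, offset, length = insert_key
    return key_buffer[offset:offset + length]


key_buffer = b"aaaabbbb"  # e.g. commit hashes already stored elsewhere
```

With a reference, the index stores only two integers per key, which is where the space saving for long keys comes from.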
Reviewed By: DurhamG
Differential Revision: D7444907
fbshipit-source-id: b89d95246845799c2c55fb73ad203a7e6724b85e
Summary:
Previously, a leaf entry could only have a `KeyOffset`. This diff allows it to
have either a `KeyOffset` or an `ExtKeyOffset`. The API didn't change much,
since `LeafOffset::key_and_link_offset` handles the difference
transparently.
Latest benchmark result:
index insertion 4.879 ms
index flush 3.620 ms
index lookup (memory) 1.827 ms
index lookup (disk, no verify) 3.508 ms
index lookup (disk, verified) 7.861 ms
Reviewed By: DurhamG
Differential Revision: D7444909
fbshipit-source-id: 5441e1ae187d42931377d7213dcb77156b2af714
Summary:
The leaf entry has a `key_and_link_offset` method. Previously it returned a
`KeyOffset`. Since we now have `ExtKeyOffset`, it's friendlier to handle the
key entry type difference at the leaf entry level, instead of requiring the
caller to handle it.
Reviewed By: DurhamG
Differential Revision: D7444905
fbshipit-source-id: 56d87641a2a5a50ddca8b1e4c74c9aaa3891b542
Summary:
Previously, I thought there would be only one index using commit hashes as
keys, namely the nodemap, and that other indexes (like childmap) would just use
shorter integer keys (ex. revision numbers, or offsets). So the space overhead
of storing full keys would only apply to one index and seemed acceptable.
But that implies strict topo order for the source of truth data (ex. to use
integers as keys in childmap, you have to know how to translate parent
revisions from hashes to integers at the time writing the revision).
Thinking about it again, it seems the topo-order requirement would make a lot
of things less flexible. It's much easier to just use hashes as keys in the
index. Then it's worthwhile to address the space efficiency problem by
introducing an "external key buffer" concept. That's actually what `radixbuf`
does.
This is the start. It adds the type to the struct. The feature is not complete
yet.
Reviewed By: DurhamG
Differential Revision: D7444904
fbshipit-source-id: 60a83c9e6e8b0734450f0c5827928a7c5bd111d5
Summary:
A subsequent diff will need access to the node's diff and meta during iteration time. It seems like a
natural part of the API so let's add it.
Note: It's possible to call `.getdelta(name, node)` to get this data if we don't read it here.
But I ran into some weird occasional OSErrors from the mmap API when I did that. So let's just
do this.
Reviewed By: DurhamG
Differential Revision: D7369225
fbshipit-source-id: 252839a549242909153c74287db8f36d6c63bd9c
Summary:
This makes hg pull use the connectionpool. This means prefetches can
reuse the existing ssh connection when appropriate. This both speeds up
prefetches, and also means they will speak to the same server that served the
pull.
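The reuse described above can be sketched with a tiny pool keyed by server path. The class name `ConnectionPool`, its `get` and `release` methods, and the `opener` callback are hypothetical and are not Mercurial's actual connectionpool interface.

```python
class ConnectionPool(object):
    """Hypothetical pool that hands back idle connections per server path."""

    def __init__(self, opener):
        self._opener = opener  # callable that opens a new connection
        self._idle = {}        # path -> list of idle connections

    def get(self, path):
        idle = self._idle.get(path)
        if idle:
            return idle.pop()      # reuse: same connection, same server
        return self._opener(path)  # nothing idle, open a new connection

    def release(self, path, conn):
        self._idle.setdefault(path, []).append(conn)
```

In this sketch a pull would `get` a connection and `release` it when done, so a following prefetch to the same path receives the same connection instead of opening a new one.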
Reviewed By: ryanmce
Differential Revision: D7481107
fbshipit-source-id: f9a3670527cb7e8956029c86d50d8e030dd3cc01
Summary:
Previously the connectionpool was a remotefilelog specific concept. We
want to start sharing connections between pull and prefetches, so let's move it
to core Mercurial.
Reviewed By: ryanmce, phillco
Differential Revision: D7480670
fbshipit-source-id: 1b2eff3b0e61a815709ffaec35df802eeda0c24b
Summary:
`hg debugcolor --style` shows the component parts of each style individually,
however this doesn't work if the styles are defined as the new fallback styles
(separated by colons). This is because the fallback is only implemented for
actual style names - it doesn't work for `ui.label('brightblue:blue', 'text')`.
It's useful to see what the fallbacks are (even if they're not necessary on
your own system), so change debugcolor to split the elements of the fallback
style and show them separately.
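Splitting a fallback style into its elements amounts to splitting on the colon separator. A minimal sketch, where the helper name `split_fallback_style` is an assumption and not the name used in the color code:

```python
def split_fallback_style(style):
    """Split a colon-separated fallback style into its individual elements."""
    return [part for part in style.split(":") if part]


elements = split_fallback_style("brightblue:blue")
```

Each element can then be rendered with its own label, so every fallback is visible regardless of what the current terminal supports.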
Reviewed By: quark-zju
Differential Revision: D7485545
fbshipit-source-id: dce7204c9f0a98bb730b3ba864db28a9ec52a339
Summary:
`len()` on a hybrid manifest wrapping a treemanifest would raise an `AttributeError`. But if there is no treemanifest or there is *only* a treemanifest, then a `TypeError` is raised. Using `len()` on an object that doesn't support length should always raise `TypeError`, consistently.
Instead of looking up the `__len__` attribute, use the built-in `len()` function, which will raise `TypeError` if the wrapped manifest in a hybrid doesn't have a `__len__` method. This ensures that we get a consistent exception.
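The difference between the two approaches can be shown with a small wrapper. The class names `HybridByAttribute`, `HybridByBuiltin`, and `NoLength` are hypothetical stand-ins for the hybrid manifest and a wrapped manifest without `__len__`.

```python
class NoLength(object):
    """Stand-in for a wrapped manifest that does not support len()."""


class HybridByAttribute(object):
    """Old approach: look up __len__ on the wrapped object directly."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __len__(self):
        return self._wrapped.__len__()  # AttributeError if __len__ is missing


class HybridByBuiltin(object):
    """New approach: delegate to the built-in len()."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __len__(self):
        return len(self._wrapped)  # TypeError if __len__ is missing
```

With the built-in, callers only need to handle `TypeError`, which is what `len()` raises for any other object without a length.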
Reviewed By: farnz
Differential Revision: D7485510
fbshipit-source-id: 4132d6b383171cde8dd99dd60098716d4aedc527
Summary:
The full command line needs to come from the `dispatch.runcommand` function, as
`sys.argv` contains `serve ...` for chg invocations.
Also make sure the correlator remains the same for commands that make multiple
connections to the server.
Reviewed By: quark-zju
Differential Revision: D7443727
fbshipit-source-id: a785e372b7b67fbd0b4ab4d73e7ff914aa5db9c3
Summary:
`remotefilelog.fileserverclient.peersetup.remotefilepeer` overrides the
`_callstream` method, however it uses `command` rather than `cmd` for the first
parameter name. This doesn't match the method it's overriding, and clashes
with clienttelemetry's use of this parameter for the original command that the
user ran.
Make this method match all the others.
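One way such a clash can surface is sketched below. The classes and the `call_with_telemetry` helper are hypothetical, and passing the original command as a `command=` keyword argument is an assumption about the mechanism, not a description of the clienttelemetry code.

```python
class BasePeer(object):
    def _callstream(self, cmd, **opts):
        return (cmd, opts)


class MismatchedPeer(BasePeer):
    # First parameter is named `command`, unlike the method it overrides.
    def _callstream(self, command, **opts):
        return super(MismatchedPeer, self)._callstream(command, **opts)


class MatchingPeer(BasePeer):
    # First parameter is named `cmd`, matching the overridden method.
    def _callstream(self, cmd, **opts):
        return super(MatchingPeer, self)._callstream(cmd, **opts)


def call_with_telemetry(peer, cmd, original_command):
    # Assumption: the user's original command travels as a `command` keyword.
    return peer._callstream(cmd, command=original_command)
```

With `MatchingPeer` the keyword lands in `opts`; with `MismatchedPeer` Python raises `TypeError` because `command` receives both a positional and a keyword value.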
Reviewed By: quark-zju
Differential Revision: D7443726
fbshipit-source-id: 1170feb21056c3e044bffaf55d95f7c48ff972fb
Summary:
gitignore could have performance issues from stat-ing .gitignore files
everywhere. That happens if watchman returns O(working copy) files. Add a config
to disable it while we look for solutions.
Reviewed By: DurhamG
Differential Revision: D7482499
fbshipit-source-id: 4c9247b0318bf034c8e9af4b74c21110cc598714
Summary:
Turns out I incorrectly assessed this situation before. We do use content from
perforce servers a lot. This change makes p4seqimport read from local disk
directly if possible rather than relying solely on `p4 print` to obtain file content.
```name=Checking file content src on master-importer task 0 (running for 15h+)
[15:40:23 twsvcscm@priv_global/independent_devinfra/ovrsource-master-importer/0 ~]$ egrep -o 'src: (gzip|rcs|p4)' /logs/stdout | sort | uniq -c
2567 src: gzip
24 src: p4
```
Differential Revision: D7388797
fbshipit-source-id: 5fe1a525bc211d64a75954d529edc152d22970a7
Summary:
Subsequent commits will need the new path of a mutable{data, hist}pack -- this makes
that data accessible.
Reviewed By: DurhamG
Differential Revision: D7369226
fbshipit-source-id: f6849aaed747fbd9afee7191e6a0e5e1357ca618
Summary: fastmanifest used the statvfs function to be smart about how much disk space it used. That function isn't available on Windows though. This optimization is optional, and we probably won't end up using the fastmanifest cache on Windows anyway, so let's just skip it if it's not available.
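A minimal sketch of skipping the check when `os.statvfs` is missing. The helper name `free_disk_bytes` and the choice to return `None` on platforms without `statvfs` are assumptions, not the fastmanifest code.

```python
import os


def free_disk_bytes(path):
    """Return free bytes available at `path`, or None if statvfs is missing."""
    statvfs = getattr(os, "statvfs", None)  # not available on Windows
    if statvfs is None:
        return None  # caller skips the disk-space optimization
    st = statvfs(path)
    return st.f_bavail * st.f_frsize


free = free_disk_bytes(".")
```

A caller would treat `None` as "unknown" and fall back to not limiting the cache by free space.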
Reviewed By: quark-zju
Differential Revision: D7478478
fbshipit-source-id: e9595f3fef397d66d76f3ecfa54f8e4328ce0921
Summary:
dsp had a look at the whole stack and suggested some changes:
* Only write bookmark once at the end of the import - we are doing a single transaction anyways so updating the bookmark after every changelist import is moot
* Remove unused function seqimporter.ChangelistImporter._safe_open
* Require fncache to preserve behavior from p4fastimport
Differential Revision: D7375481
fbshipit-source-id: f4407d5d0276f96d72bf67544091640fe1c46044
Summary: Updates the importer wrapper to use the new p4seqimport, replacing p4fastimport.
Differential Revision: D7326764
fbshipit-source-id: 588486bfd747086396f47e678da05c6eafd30565
Summary:
When testing p4seqimport with remotefilelog it would barf on the call to `.tip()`,
because remotefilelog doesn't have that method.
This change makes use of the change context from the repo instead to get the
tip node.
Differential Revision: D7294979
fbshipit-source-id: 18b4a5107f4cbf676016d44d5134bf0d252eeff3
Summary:
Testing that p4seqimport works properly for branching
Based on comment on D7172867
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7203765
fbshipit-source-id: 2e328f5b43fc47a60bfe2c41f9454f8471dda814
Summary:
Perforce supports RCS keyworded files, more info here:
http://answers.perforce.com/articles/KB/3482
p4fastimport collapses expanded keywords back to their unexpanded form; this
change replicates that behavior in p4seqimport (the unit test should clarify
what this means).
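As an illustration of what collapsing an expanded keyword means, the sketch below turns `$Id: ... $` back into `$Id$`. The regular expression, the keyword list, and the helper name `collapse_keywords` are assumptions made for this example and may differ from what the importer actually uses.

```python
import re

# Assumed keyword list; DateTime is listed before Date for readability only.
KEYWORD_RE = re.compile(
    r"\$(Id|Header|DateTime|Date|Change|File|Revision|Author):[^$\n]*\$"
)


def collapse_keywords(text):
    """Replace expanded keywords such as '$Id: ... $' with '$Id$'."""
    return KEYWORD_RE.sub(r"$\1$", text)


collapsed = collapse_keywords("// $Id: //depot/main/a.c#3 $")
```

Text that contains only unexpanded keywords is returned unchanged, because the pattern requires the colon that follows an expanded keyword name.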
Differential Revision: D7188163
fbshipit-source-id: 594f71d6114c73001753ae36c4973c2db3310e62
Summary:
Respect the executable bit on files based on perforce type.
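A rough sketch of deriving the executable bit from a Perforce file type string. It assumes the type is written as a base type with optional `+` modifiers (for example `text+x`) and treats the legacy aliases `xtext` and `xbinary` as executable; the helper name `is_executable` and the (non-exhaustive) alias set are assumptions, not the importer's code.

```python
# Assumed, non-exhaustive set of legacy type aliases that imply the exec bit.
LEGACY_EXECUTABLE_TYPES = frozenset(["xtext", "xbinary"])


def is_executable(p4type):
    """Return True if the Perforce file type implies the executable bit."""
    base, _, modifiers = p4type.partition("+")
    # The check is case-sensitive: only a lowercase 'x' modifier counts.
    return "x" in modifiers or base in LEGACY_EXECUTABLE_TYPES
```

The result would decide whether the file is written with the `x` flag in the hg commit.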
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7185388
fbshipit-source-id: 59afec7bd857572b8347ebe546d131017a79928c
Summary:
p4seqimport has used very high level mercurial abstractions so far (almost
equivalent to running hg add / mv / rm / commit on command line). This is very
easy to grasp as we use it day to day, but it is not performant enough for our
importer:
- It does the work twice (write to working copy, then commit changing hg metadata)
- It requires the working copy (this would force us to update between revs,
materializing a prohibitively large number of files)
This change makes use of memctx, which is basically an in-memory commit. This way
we don't need a working copy and we save time + a lot of space.
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7176903
fbshipit-source-id: 2773d7c001b615837496ea9db3229d9afc020124
Summary:
p4seqimport has a bookmark option, but it was completely ignored before this
change. This change makes use of the option, moving the bookmark as we import changes.
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7172867
fbshipit-source-id: be63765088b0583df2e1c9e0ccec869c5278d782
Summary:
Properly create files as symlinks if they are symlinks in P4
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Reviewed By: wlis
Differential Revision: D7157772
fbshipit-source-id: ac3e5010f3d15460592a449c817824c0b28a8435
Summary:
Similar to #10 (D7113181), we need to track large files.
This change adds the bits to do so, reusing the logic from p4fastimport which was
moved to lfs.py
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7115654
fbshipit-source-id: 56ccfadf6fa14dcfb8005cc5ef03fb175835bcda
Summary:
This change makes seqimport write revision info (i.e. (CL, hghash) pairs) to a
sqlite file. This is used by the importer TW job wrapper to write the info
into `xdb.p4sync` table `revmap`
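Writing (CL, hghash) pairs to a sqlite file could look roughly like this. The table name `revmap` is borrowed from the summary above, but the column names, the schema, and the helper name `write_revmap` are assumptions for illustration.

```python
import sqlite3


def write_revmap(conn, pairs):
    """Store (changelist, hg hash) pairs; the schema is hypothetical."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS revmap ("
            "cl INTEGER PRIMARY KEY, hghash TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT OR REPLACE INTO revmap (cl, hghash) VALUES (?, ?)", pairs
        )


conn = sqlite3.connect(":memory:")  # a real run would pass a file path
write_revmap(conn, [(101, "a" * 40), (102, "b" * 40)])
rows = conn.execute("SELECT cl, hghash FROM revmap ORDER BY cl").fetchall()
```

A wrapper job can then read the rows back from the file and copy them into its own database.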
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7113181
fbshipit-source-id: e55a8cf0b794216a4855ae7486885c3d956cd7fb
Summary:
- Adds p4changelist to commit extra info
- With p4changelist info, makes p4seqimport incremental
- Adds a debug message to have more accurate info on what is actually being imported
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7090090
fbshipit-source-id: 17529aa57452453cfe29c3c3dc9d9e7daa8cffb2
Summary:
Adds copy tracing to `p4seqimport` by:
- Leveraging `fromFile` from `p4 -ztag describe` to introduce source for moved
files into P4Changelist.load's
- Utilizing that info from P4 CL when creating hg commit
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7074892
fbshipit-source-id: e105a608bb953a8137ec6c9afc7e0571a902c868
Summary:
Consolidates manipulation of p4 CL info into p4 module, pulling the relevant code
out of ChangeManifestImporter creategen so it can be easily shared by
p4fastimport and p4seqimport
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7064179
fbshipit-source-id: 72c5bcad209eebf40ec8152a07f98f7f7fa544fb
Summary:
Adds logic to create the commit, using info from p4 CL + the list of added and
removed files.
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7063983
fbshipit-source-id: c64e44c19d06e54fe35121a8d6128de050f93823
Summary:
Read file from perforce, write into the hg repo.
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7050157
fbshipit-source-id: 4389ba11f62c8ed825d6a6ef3c001095339eb551
Summary:
Creates ChangelistImporter, which will be responsible for translating a p4 CL to
a hg commit
For now it only goes through files touched by the CL and lists what was added or
removed. Next diffs will evolve it to the point where it effectively performs the
translation.
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7049961
fbshipit-source-id: 6a9f3bd57cadc2b9ea8a81373cc10dfda76311e7
Summary:
Pulls the logic to define changelists from p4fastimport into separate function
and re-uses that in p4seqimport
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7035674
fbshipit-source-id: 699e9148d35e437f306062f290c8ec2a857df480
Summary:
This change:
- Moves some opts sanitizing logic into function `sanitizeopts`
- Adds checks for `limit` being a positive integer
- Uses the new `sanitizeopts` function in p4seqimport
- Adds a test covering `sanitizeopts`
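The `limit` check listed above might look like the following sketch. The helper name `sanitize_limit`, the treatment of a missing or empty value as "no limit", and the use of `ValueError` are assumptions; the real `sanitizeopts` may behave differently.

```python
def sanitize_limit(opts):
    """Return `limit` as a positive int, or None when it is not set."""
    raw = opts.get("limit")
    if raw is None or raw == "":
        return None  # assumption: an unset limit means "no limit"
    try:
        limit = int(raw)
    except (TypeError, ValueError):
        raise ValueError("limit must be a positive integer, got %r" % (raw,))
    if limit <= 0:
        raise ValueError("limit must be a positive integer, got %r" % (raw,))
    return limit
```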
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7035217
fbshipit-source-id: cd677fb254ff83d123673d51a1c682639de08a30
Summary:
p4seqimport will be the new command to import from p4 to hg changelist by
changelist. This should provide us with a more robust importer that doesn't rely
on fiddling with hg's data structures directly. p4fastimport was important to
create ovrsource from scratch and import thousands of changelists, but moving
forward it is probably safer and easier to understand/maintain something that is
based on higher level Mercurial APIs
All that said, this is the first change. This change:
1. Creates p4seqimport command as part of the p4fastimport extension
2. Refactors the p4 client checking logic into `enforce_p4_client_exists`
3. Adds a test that checks the new function works through using `p4seqimport`.
For a high-level overview of p4seqimport, please check https://our.intern.facebook.com/intern/wiki/IDI/p4seqimport/
Differential Revision: D7015941
fbshipit-source-id: cb5c59b2f104f336a078025544a44028bf01fa85
Summary:
After testing locally, I couldn't conclusively prove whether rebasing a single change with IMM was any faster or slower than on disk.
Using IMM on the working copy will definitely be better for rebasing stacks, and it's just nicer to not have the working copy thrash around as much. It also might be interesting to (possibly) let you work while the rebase is running, too.* So I've added the code that will let us enable this more widely (as a subset of IMM) to experiment.
*I've made it so that if you make any changes during the rebase (causing the last update to fail), we just print a nice message telling you to checkout the new rebased working copy commit, instead of failing/aborting. TBD whether this is something we want to encourage people to do, however. I've kept the existing up-front check for uncommitted changes when rebasing the WCP with IMM for now.
Reviewed By: DurhamG
Differential Revision: D7051282
fbshipit-source-id: c04302539021f481c17e47c23d3f4d8b3ed59db6