Commit Graph

14 Commits

Author SHA1 Message Date
David Soria Parra
f514db7c65 p4fastimport: support action archive
Summary:
Support the action archive. 'archive' means that a revision was
"archived" to a different depot. We must ensure we support the action correctly
in order to have a smooth import.

Test Plan: run it && rt test-p4* test-check*

Reviewers: #sourcecontrol, #idi, wlis

Reviewed By: wlis

Subscribers: wlis, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4980571

Signature: t1:4980571:1493676115:ad0d3748c52747aeb6427fd25ece9d4987886936
2017-05-01 20:43:11 -07:00
David Soria Parra
305a284a00 p4fastimport: generate filelogs using fstat concurrently
Summary:
Generating case-correct filelogs using fstat leads to O(changelists)
calls to Perforce (and an overall complexity of O(changelists * number of files)),
which is slow. We want to run this using workers.
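
A minimal sketch of the idea, not the importer's actual code: the per-changelist calls are independent, so they can be handed to a pool of workers. The function `fetch_fstat` is a hypothetical stand-in for one `p4 fstat` round trip, and a thread pool is swapped in here for whatever worker mechanism the extension really uses.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_fstat(changelist):
    # Hypothetical stand-in for one fstat round trip to Perforce.
    # A real importer would run the Perforce command here.
    return changelist, ["//depot/file%d.txt" % changelist]


def fetch_all(changelists, workers=4):
    # Run the independent per-changelist calls concurrently.
    # Executor.map() yields results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(fetch_fstat, changelists))
```

The number of calls stays O(changelists); only the wall-clock time is reduced, since several calls are in flight at once.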

Test Plan: rt test-p4* test-check*

Reviewers: #sourcecontrol, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4963767

Signature: t1:4963767:1493349047:3eaddf6a3bb2ee06decaac48980c69b8645ebbed
2017-05-01 20:43:11 -07:00
David Soria Parra
bd16b38823 p4fastimport: add --limit option to process N changelists per transaction
Summary:
When passing --limit we are processing N Perforce changelists at a time.
The goal is to provide savepoints for large imports.
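
The batching can be sketched as follows, assuming the changelists are already available as an ordered list. The helper name `batches` is hypothetical and only illustrates the "N changelists per transaction" idea.

```python
def batches(changelists, limit):
    # Yield consecutive slices of at most `limit` changelists.
    # Each slice would be imported in its own transaction, so a
    # failure only loses the batch that is currently in progress.
    if limit <= 0:
        raise ValueError("limit must be positive")
    for start in range(0, len(changelists), limit):
        yield changelists[start:start + limit]
```

Each completed batch then acts as a savepoint from which a later run can continue.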

Test Plan: rt test-p4* test-check*

Reviewers: #sourcecontrol, quark

Reviewed By: quark

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4980482

Signature: t1:4980482:1493688071:800a0bafda33a17cb2ef54c9f399db7055a8cbf9
2017-05-01 20:43:11 -07:00
David Soria Parra
fc6dc3be52 p4fastimport: adding debug message to filelog loading
Summary: Add debug output while loading filelogs when --debug is passed.

Test Plan: rt test-check* test-p4*

Reviewers: #sourcecontrol, #idi, ikostia

Reviewed By: ikostia

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4963656

Signature: t1:4963656:1493315731:f2b28dd06e4611d74e039d8b7805f72eb1bc16e4
2017-05-01 20:43:11 -07:00
David Soria Parra
63370254c8 p4fastimport: implement --bookmark
Summary:
Implement an option to set a bookmark after a successful import.
We used to try to calculate this for continuous imports in a wrapper around the
importer. However it's much easier to just do it inside the importer itself, in
particular when we add branch support later.

Test Plan: rt test-p4*

Reviewers: #idi, #sourcecontrol, quark

Reviewed By: quark

Subscribers: quark, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4935717

Signature: t1:4935717:1493127835:262955b3288d9bd03ca08a45d7ec1667d786430a
2017-04-25 15:29:39 +01:00
David Soria Parra
4d406cac74 p4fastimport: use with statement
Summary:
Use the with statement to lock/unlock. This is more failsafe
and less error prone.
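
As a generic illustration, with `threading.Lock` used as a stand-in for the repository lock objects: the with statement releases the lock even when the body raises, which a manual lock/unlock pair only does if every exit path is handled.

```python
import threading

import_lock = threading.Lock()  # stand-in for a repository lock


def run_import(step):
    # The lock is acquired on entry and released on exit,
    # including when step() raises an exception.
    with import_lock:
        return step()


def failing_step():
    raise RuntimeError("import failed")
```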

Test Plan: rt test-p4*

Reviewers: #sourcecontrol, #idi, mjpieters

Reviewed By: mjpieters

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4921212

Signature: t1:4921212:1492696169:8a7075068252063f140bedeb58ddf70e36a138f4
2017-04-25 15:29:39 +01:00
David Soria Parra
78dcff9f00 p4fastimport: support client view
Summary:
Perforce clients support views. A view maps a server-side path to a client-side
path, e.g. the view '//depot/A/B/... //myclient/foo/...' maps every file in
//depot/A/B/ on the server side to a local checkout foo/ inside the root for
checkouts defined for the client 'myclient'.
We are using `p4 where` to support these mappings. We do the mapping inside the
FileImporter at the moment as this runs nicely in parallel. It's a bit hacky
but gets the job done. We use this mostly to omit the common prefix
//depot/... and remove branch indicators such as Main.

So in our case a view looks like

  //depot/Software/OculusSDK/PC/Main... //client/Software/OculusSDK/PC/...

resulting in a file

  //depot/Software/OculusSDK/PC/Main/test.txt

being imported as

  Software/OculusSDK/PC/test.txt
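
To make the mapping concrete, here is a rough pure-Python approximation of what a single view line does, using the '//depot/A/B/... //myclient/foo/...' example. The importer itself asks Perforce via `p4 where`; the function `map_depot_path` is hypothetical and handles only a trailing '/...' wildcard on both sides.

```python
def map_depot_path(depot_path, view, client):
    # Split "//depot/A/B/... //myclient/foo/..." into its two sides.
    left, right = view.split()
    if not (left.endswith("/...") and right.endswith("/...")):
        raise ValueError("this sketch only handles trailing '/...' wildcards")
    depot_prefix = left[:-3]    # e.g. "//depot/A/B/"
    client_prefix = right[:-3]  # e.g. "//myclient/foo/"
    if not depot_path.startswith(depot_prefix):
        return None  # the file is not covered by this view line
    client_path = client_prefix + depot_path[len(depot_prefix):]
    root = "//%s/" % client
    if not client_path.startswith(root):
        return None
    # Drop the "//client/" part to get the path relative to the client root.
    return client_path[len(root):]
```

For example, `map_depot_path("//depot/A/B/src/x.c", "//depot/A/B/... //myclient/foo/...", "myclient")` gives `foo/src/x.c`.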

Test Plan: rt test-p4*

Reviewers: #sourcecontrol, #idi, ikostia

Reviewed By: ikostia

Subscribers: ikostia, durham, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4913483

Signature: t1:4913483:1492702356:b97b691343b8a1d52940445934730b31d411db4c
2017-04-25 15:29:39 +01:00
David Soria Parra
3799abce68 p4fastimport: write metadata to sqlite
Summary:
Similar to LFS, it can be useful to read the metadata about imported
commits after the import. We write the imported p4 changelist <> hg changeset
map to an sqlite database. This lets wrapper scripts use the map
without having to traverse the full hg history. In particular, it
allows checking whether new revs were imported.
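
A sketch of how a wrapper script might use such a database. The table name `revmap` and its columns are assumptions made for illustration; the schema the extension actually writes is not described here.

```python
import sqlite3


def open_db(path=":memory:"):
    # Hypothetical schema: one row per imported changelist.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS revmap ("
        "cl INTEGER PRIMARY KEY, node TEXT NOT NULL)"
    )
    return conn


def record(conn, cl, node):
    # `with conn` commits on success and rolls back on error.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO revmap (cl, node) VALUES (?, ?)",
            (cl, node),
        )


def last_imported(conn):
    # Returns None when nothing has been imported yet.
    return conn.execute("SELECT MAX(cl) FROM revmap").fetchone()[0]


def node_for(conn, cl):
    row = conn.execute("SELECT node FROM revmap WHERE cl = ?", (cl,)).fetchone()
    return row[0] if row else None
```

Comparing `last_imported()` before and after a run is then enough to tell whether new revs were imported.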

Test Plan: rt test-p4*

Reviewers: #sourcecontrol, #idi, quark

Reviewed By: quark

Subscribers: quark, wlis, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4913476

Signature: t1:4913476:1493127257:b69191de034908f493e0c68fda3b56ff8703e2d4
2017-04-25 15:29:39 +01:00
David Soria Parra
2ab5a0baf8 p4fastimport: support for writing LFS metadata to sqlite
Summary:
We do not write the blobs to local cache anymore. We want our LFS
server to import them from Perforce directly or serve them from Perforce
directly. In order to do so, we need the correct mapping from oid to Perforce
file + cl. This is generally useful metainformation that other LFS
implementations can use. We simply write the data to sqlite because it's simple
and built in.

Test Plan: rt test-p4*

Reviewers: #sourcecontrol, #idi, quark

Reviewed By: quark

Subscribers: quark, durham, wlis, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4913469

Signature: t1:4913469:1492796253:1e3b389c7cb0ba3acf9504410d267a1cf9651118
2017-04-25 15:29:39 +01:00
David Soria Parra
7d78fa39bf p4fastimport: initial support for writing lfs metadata
Summary:
Add a special mode to the importer that patches the LFS extension to
not write blobs to local disk. In our case we do have the files already in
Perforce and do not have to write them again to disk. This is currently breaking
verify and therefore we are patching verify.

Test Plan: rt test-p4*

Reviewers: #sourcecontrol, #idi, durham, quark

Reviewed By: quark

Subscribers: quark, durham, wlis, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4913455

Signature: t1:4913455:1492979419:204c1075376fe975ddea880b22e6984684e7ff25
2017-04-25 15:29:39 +01:00
David Soria Parra
44a0aabf25 p4fastimport: incremental imports
Summary:
Implement incremental imports.

1. Find the last imported perforce changelist.
2. Set startctx and use it in all importers
3. Import filelogs from their current position (we "should" add an additional check here, but we don't)
4. Import manifests and changelists.

Manifests are a bit tricky because we must obtain the original filelog revision
*BEFORE* we imported them, but manifest imports come "after". We could read the
most recent entry from manifests, but that won't cover the case in which files
are added. So instead, since we know the changelist that we are currently importing,
we look for the rev with the correct linkrev in filelogs. That's a bit ugly,
but it works. We could instead return the original offset from the worker and
pass it into the manifest importer, but I feel that is not much better and
eventually more error prone.

Test Plan: cd tests && rt test-p4*

Reviewers: #sourcecontrol, durham

Reviewed By: durham

Subscribers: durham, mjpieters, #idi

Differential Revision: https://phabricator.intern.facebook.com/D4890110

Signature: t1:4890110:1492662991:0e141e62734e1224ac8e1c11f4e8794452455b18
2017-04-19 23:33:06 -07:00
David Soria Parra
11061394c4 p4fastimport: remove copytracing code
Summary: remove the unused copytracing code until we use it again.

Test Plan: rt test-p4*

Reviewers: #sourcecontrol, #idi, wlis

Reviewed By: wlis

Subscribers: wlis, mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4913446

Signature: t1:4913446:1492624958:f6c083f3c64352f0a8e1172a1d6c6338ee86cd24
2017-04-19 23:33:06 -07:00
David Soria Parra
4359d6f9ad p4fastimport: return filename from worker
Summary:
Return the filename from the worker so that we can later use it
for better messages and pass it correctly to progress().

Test Plan: cd tests && rt test-p4*

Reviewers: #sourcecontrol, durham, ikostia

Reviewed By: ikostia

Subscribers: ikostia, durham, mjpieters, #idi

Differential Revision: https://phabricator.intern.facebook.com/D4890079

Signature: t1:4890079:1492643347:10eb254ba99faf9e28107ede3ef78f1fcab7946a
2017-04-19 23:33:06 -07:00
David Soria Parra
ef08c10f5b p4fastimport: introducing fast Perforce to Mercurial convert extension
Summary:
`p4fastimport` is a fast convert extension for Perforce to Mercurial. It
is designed to generate filelogs in parallel from Perforce. It tries to
minimize the use of Perforce commands and reads from the Perforce
store on a Perforce server directly.

The core of p4fastimport is the idea to generate a Mercurial filelog
directly from the underlying Perforce data, as a Perforce file in most
cases matches a filelog directly (per-file branches is an exception). To
generate a filelog we are reading each file for an imported revision. A
file in Perforce is locally either stored in RCS, as a compressed GZIP
or as a flat file (binaries). If we do not find a version locally on
disk we fall back to downloading it from Perforce.

We are generating manifests after all filelogs are imported. A manifest
is constructed by adding and removing files from an initial state. We
are generating the correct offset from a manifest into the filelog by
keeping track of how often a file was touched.
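
The "how often a file was touched" bookkeeping could look roughly like this. The input format (a list of touched file names per changelist, in import order) and the helper name `filelog_offsets` are assumptions for illustration.

```python
from collections import defaultdict


def filelog_offsets(changelists):
    # For every changelist, map each touched file to the index of the
    # filelog revision created by that changelist. The index is simply
    # the number of times the file was touched before.
    touched = defaultdict(int)
    result = []
    for files in changelists:
        offsets = {}
        for name in files:
            offsets[name] = touched[name]
            touched[name] += 1
        result.append(offsets)
    return result
```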

We then generate the changelog.

Linkrev generation is a bit tricky. For every file in Perforce we know
to which changelist it belongs, as its stored revision contains the
changelist. E.g. 1.1422 is the file changed in the changelist 1422 (this
refers to the "original" changelist, before a potential renumbering,
which is why we use the -O switch). We use the CL number obtained
from the revision to reverse lookup the offset in the sorted list of
changelists, which corresponds to its place in the changelog later,
and therefore its correct linkrev.
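
The reverse lookup can be sketched with a binary search over the sorted changelist numbers. Both helper names are hypothetical, and the revision format is assumed to be the "1.<changelist>" form from the example above.

```python
import bisect


def changelist_of(revision):
    # "1.1422" -> 1422: the changelist is the part after the last dot.
    return int(revision.rsplit(".", 1)[1])


def linkrev(sorted_changelists, cl):
    # The position of the changelist in the sorted list is its position
    # in the changelog, which is the linkrev for the file revision.
    i = bisect.bisect_left(sorted_changelists, cl)
    if i == len(sorted_changelists) or sorted_changelists[i] != cl:
        raise KeyError(cl)
    return i
```

With changelists `[1001, 1422, 1500]`, the revision `1.1422` maps to linkrev 1.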

Parallel imports: In order to run parallel imports we MUST keep one lock
at a time, even if we import multiple file logs at the same time. However
filelogs use a singular `fncache`, which will be corrupted if we generate
filelogs in parallel. To avoid this, repositories must be generated with
*fncache* disabled! This restricts `p4fastimport` with workers to run
only on case sensitive file systems.

Test Plan:
The included tests as well as multiple imports from a small testing
Perforce client. Afterwards successfully run `hg verify`

  make tests

Reviewers: #idi, quark, durham

Reviewed By: durham

Subscribers: mjpieters

Differential Revision: https://phabricator.intern.facebook.com/D4776651

Signature: t1:4776651:1492015012:0161c4f45eab4d3b64597d012188c5f2007e8f7d
2017-04-13 11:11:09 -07:00