Commit Graph

1066 Commits

Author SHA1 Message Date
Yuya Nishihara
dcade16cf7 encoding: factor out unicode variants of from/tolocal()
Unfortunately, these functions will be commonly used on Python 3.
2017-03-13 09:11:08 -07:00
Pierre-Yves David
76c3fb0c64 convert: don't use mutable default argument value
Caught by pylint.
2017-03-14 23:48:08 -07:00
Pierre-Yves David
db602f5625 convert: directly use repo.vfs.join
The 'repo.join' method is about to be deprecated.
2017-03-08 16:51:25 -08:00
Pierre-Yves David
f3a174150b vfs: use 'vfs' module directly in 'hgext.convert'
Now that the 'vfs' classes moved in their own module, lets use the new module
directly. We update code iteratively to help with possible bisect needs in the
future.
2017-03-02 13:32:14 +01:00
Pierre-Yves David
e5cb48ac36 vfs: replace 'scmutil.opener' usage with 'scmutil.vfs'
The 'vfs' class is the first class citizen for years. We remove all usages of
the older API. This will let us remove the old API eventually.
2017-03-02 03:52:36 +01:00
Pulkit Goyal
56031921a5 py3: convert the mode argument of os.fdopen to unicodes (2 of 2) 2017-02-13 22:15:28 +05:30
Gregory Szorc
19ccacb90b convert: remove "replacecommitter" action
As pointed out by Yuya, this action doesn't add much (any?) value.
2017-01-14 10:11:19 -08:00
Gregory Szorc
2fc8eb0c18 convert: config option to control Git committer actions
When converting a Git repository to Mercurial at Mozilla, I encountered
a scenario where I didn't want `hg convert` to automatically add the
"committer: <committer>" line to commit messages. While I can hack around
this by rewriting the Git commit before it is fed into `hg convert`,
I figured it would be a useful knob to control.

This patch introduces a config option that allows lots of control
over the committer value. I initially implemented this as a single
boolean flag to control whether to save the committer message. But
then there was feedback that it would be useful to save the committer
in extra data. While this patch doesn't implement support for saving
in extra data, it does add a mechanism for extending which actions
to take on the committer field. We should be able to easily add
actions to save in extra data.

Some of the implemented features weren't asked for. But I figured they
could be useful. If nothing else they demonstrate the extensibility
of this mechanism.
2017-01-13 23:21:10 -08:00
Gregory Szorc
57ef32586b convert: add config option to control storing original revision
common.commit.__init__ sets saverev=True by default. The side effect
of this is that the hg sink will always set the "convert_revision"
extras key to the commit being converted.

This patch adds a config option to disable this behavior.

While most consumers will want "convert_revision" to be a) written
b) with the exact Git commit that was converted, some have use cases
that prefer otherwise. In my case, I am performing significant
rewrites of a Git repository *before* it is fed into `hg convert`.
I have to do this because `hg convert` does not easily support the kind
of transform I desire, even with extensions. (For the curious, I am
"linearizing" the history of a GitHub repo by removing merge commits
which add little value to the final history. It isn't easy to do this
during `hg convert` because of Mercurial's file copy/rename metadata
requirements.)

In my scenario, my pre-convert transform stores a "convert_revision"
key in the Git commit object containing the original Git commit ID.
I want this original Git commit ID carried forward to Mercurial. By
disabling the setting of this extra during `hg convert` and copying
the value from the Git commit object, I can have the final
"convert_revision" extra key contain the original Git commit ID. An
added test verifies this exact scenario.

This feature could likely be implemented for other VCS sources. But
until someone needs the feature, I'm inclined to hold off implementing.
2016-12-22 23:28:35 -07:00
Gregory Szorc
6c0707a02c convert: add config option to copy extra keys from Git commits
Git commit objects support storing arbitrary key-value metadata. While
there is no user-facing mechanism in Git to record these values, some
tools do record data here.

Currently, `hg convert` only handles the "author," "committer," and
"parent" keys in Git commit objects. All other keys are ignored. This
means that any custom keys are lost when converting Git repos to
Mercurial.

This patch implements support for copying a whitelist of extra keys
from Git commit objects to the "extras" dict of the destination. As
the added tests demonstate, this allows extra metadata to be preserved
during the conversion process.

This patch stops short of converting all metadata to "extras." We could
potentially implement this via `convert.git.extrakeys=*` or similar.
But copying everything by default is a bit dangerous because if Git
adds new keys to commit objects, we could find ourselves copying
things that shouldn't be copied!

This patch also assumes the source key is the same as the destination
key. We could implement support for prefixing the output key to
distinguish it as coming from Git. But until this feature is needed,
I'm inclined to hold off implementing it.
2016-12-22 23:28:11 -07:00
Gregory Szorc
b79026697e convert: don't use {} as default argument value
This is a common Python gotcha. I'm kinda surprised we don't have a
check-code to detect this :/
2016-12-22 09:26:47 -08:00
Gregory Szorc
8f73471951 convert: config option for git rename limit
By default, Git applies rename and copy detection to 400 files. The
diff.renamelimit config option and -l argument to diff commands can
override this.

As part of converting some repositories in the wild, I was hitting
the default limit. Unfortunately, the warnings that Git prints in this
scenario are swallowed because the process running functionality in
common.py redirects stderr to /dev/null by default. This seems like
a bug, but a bug for another day.

This commit establishes a config option to send the rename limit
through to `git diff-tree`. The added tests demonstrate a too-low
rename limit doesn't result in copy metadata being recorded.
2016-12-18 12:53:20 -08:00
Pulkit Goyal
1f6538b90b py3: replace os.name with pycompat.osname (part 2 of 2) 2016-12-19 00:28:12 +05:30
Pulkit Goyal
9251b69fc4 py3: replace os.environ with encoding.environ (part 5 of 5) 2016-12-18 02:08:59 +05:30
David Soria Parra
34b35d6c71 convert: parse perforce data on-demand
We are using read-only attributes that parse the perforce data on
demand. We are reading the data only once whenever an attribute is
requested and use it throughout the import process. This is equivalent
to the previous behavior, but we are avoiding reading from perforce when
we initialize the object, but instead run it during the actual import
process, when the first attribute is requested (usually getheads(), see
`convertcmd.convert`).
2016-12-20 09:23:50 -08:00
David Soria Parra
74a69748a6 convert: return calculated values from parse() instead of manpulating state 2016-12-20 09:23:50 -08:00
David Soria Parra
aede854a35 convert: move localname state to function scope 2016-12-20 09:23:50 -08:00
David Soria Parra
41ea64b87e convert: use return value in parse_view() instead of manipulating state 2016-12-20 09:23:50 -08:00
Yuya Nishihara
cf9b8abe16 convert: remove unused-but-set variable introduced in d68d4752d365
Spotted by pyflakes.
2016-12-18 16:20:04 +09:00
Pulkit Goyal
908a677054 py3: replace os.sep with pycompat.ossep (part 4 of 4) 2016-12-17 20:24:46 +05:30
Yuya Nishihara
601f618d04 convert: inline strutil.rfindall()
This is the only place where strutil is used. I don't think it's worth to
keep the strutil module, so inline it.

Also, strutil.rfindall() appears to have off-by-one error. 'end = c - 1' is
wrong because 'end' is exclusive.
2016-10-16 16:58:43 +09:00
David Soria Parra
dab6f08428 convert: return commit objects for revisions in the revmap
Source revision data that exists in the revmap are ignored when pulling
data from Perforce as we consider them already imported. In case where
the `convertcmd.convert` algorithm requests a commit object for such
a revision we are creating it.  This is usually the case for parent of
the first imported revision.
2016-12-14 12:07:23 -08:00
David Soria Parra
52774a1f49 convert: encapsulate commit data fetching and commit object creation
Split fetching the `describe` form from Perforce and the commit object creation
into two functions. This allows us to reuse the commit construction for
revisions passed from a revmap.
2016-12-13 21:49:58 -08:00
David Soria Parra
8040456970 convert: do not provide head revisions if we have no changests to import
Don't set a head revision in cases where we have a revmap but no
changesets to import, as convertcmd.convert() treats them as heads of
to-imported revisions.
2016-12-13 21:49:58 -08:00
David Soria Parra
4683ce4a8a convert: allow passing in a revmap
Implement `common.setrevmap` which is used to pass in a file with existing
revision mappings. This functionality is used by `convertcmd.convert` if it
exists and allows implementors such as the p4 converter to make use of an
existing mapping.

We are using the revmap to abort scanning and the repository for more information
if we already have the revision. This means we are allowing incremental imports
in cases where a revmap is provided.
2016-12-14 01:45:57 -08:00
David Soria Parra
cde8421946 convert: use convert_revision for P4 imports
We are using convert_revisions in other importers. In order to unify this
we are also using convert_revision for Perforce in addition to the original
'p4'.
2016-12-13 21:49:58 -08:00
David Soria Parra
eeb4ddccf3 convert: remove unused dictionaries
self.parent, self.lastbranch and self.tags have never been used.
2016-12-14 01:45:17 -08:00
David Soria Parra
5b7b900ca3 convert: self.heads is a list
self.heads is used as a list throughout convert and never a dictionary.
Initialize it correctly to a list.
2016-12-14 01:43:47 -08:00
David Soria Parra
338550df2b convert: don't use long list comprehensions
We are iterating over p4changes. Make the continue condition more clear
and easier to add new conditions in future patches, by removing the list
comprehension and move the condition into the existing for-loop.
2016-12-13 21:49:58 -08:00
Pulkit Goyal
97f340e354 py3: use pycompat.getcwd() instead of os.getcwd()
We have pycompat.getcwd() which returns bytes path on Python 3. This patch
changes most of the occurences of the os.getcwd() with pycompat one.
2016-11-23 00:03:11 +05:30
Jun Wu
216e95e106 convert: migrate to util.iterfile 2016-11-14 23:17:15 +00:00
Durham Goode
52b8095f37 manifest: remove last uses of repo.manifest
Now that all the functionality has been moved to manifestlog/manifestrevlog/etc,
we can finally change all the uses of repo.manifest to use the new versions. A
future diff will then delete repo.manifest.

One additional change in this commit is to change repo.manifestlog to be a
@storecache property instead of @property. This is required by some uses of
repo.manifest require that it be settable (contrib/perf.py and the static http
server). We can't do this in a prior change because we can't use @storecache on
this until repo.manifest is no longer used anywhere.
2016-11-10 02:13:19 -08:00
Yuya Nishihara
3ea0560e2b convert: have debugsvnlog obtain standard streams from ui
This will help porting to Python 3, where sys.stdin/out/err are unfortunately
unicode streams so we can't use them directly.
2015-10-03 14:34:56 +09:00
Yuya Nishihara
3f27064dab convert: remove superfluous setbinary() calls from debugsvnlog
0f20f68c768c made standard streams set to binary mode globally.
2015-10-03 14:29:13 +09:00
Mateusz Kwapich
f97c61b781 py3: use raw strings in line continuation (convert ext)
Our py2 to py3 string translations marks those as bytestrings.
2016-10-10 05:31:31 -07:00
Augie Fackler
4e1c384d0a extensions: change magic "shipped with hg" string
I've caught multiple extensions in the wild lying about being
'internal', so it's time to move the goalposts on people. Goalpost
moving will continue until third party extensions stop trying to
defeat the system.
2016-08-23 11:26:08 -04:00
Durham Goode
769d2595c4 convert: move svn config initializer out of the module level
The svn_config_get_config config call was being called at the module level, but
had the potential to throw permission denied errors if ~/.subversion/servers was
not readable. This could happen in certain test environments where the user
permissions were very particular.

This prevented the remotenames extension from loading, since it imports
convert's hg module, which imports convert's subversion module, which calls
this. The config is only ever used from this one constructor, so let's just move
it in to there.
2016-08-01 17:38:01 -07:00
Kevin Bullock
99e7f64512 convert: update use of deprecated bzrlib property
The inventory property was deprecated in favor of root_inventory in bzr
2.5.0. Current version is 2.7.0.

I noticed this when testing locally on Python 2.6.9, which has warnings
turned on by default. The failure that occurs without this patch can be
seen on Python 2.7 by running with warnings enabled:

     $ PYTHONWARNINGS=::DeprecationWarning make 'test-convert-bzr*'
2016-07-19 11:00:32 -05:00
Pulkit Goyal
5f52c722cf py3: conditionalize cPickle import by adding in util
The cPickle is renamed to _pickle in python3 and this C extension is available
 in pickle which was not included in earlier versions. So imports are conditionalized
 to import cPickle in py2 and pickle in py3. Moreover the use of pickle in py2 is
 switched to cPickle as the C extension is faster. The hack is added in util.py and
the modules import util.pickle
2016-06-04 14:38:00 +05:30
Yuya Nishihara
a5c934df3c py3: move up symbol imports to enforce import-checker rules
Since (b) is banned, we should do the same for (a) for consistency.

 a) from mercurial import hg
    from mercurial.i18n import _

 b) from . import hg
    from .i18n import _
2016-05-14 14:03:12 +09:00
Blake Burkhart
a5bebc504a convert: pass absolute paths to git (SEC)
Fixes CVE-2016-3105 (1/1).

Previously, it was possible for the repository path passed to git-ls-remote
to be misinterpreted as a URL.

Always passing an absolute path to git is a simple way to avoid this.
2016-04-06 22:57:46 -05:00
Mads Kiilerich
274ba69e53 convert: keep converted hg parents that are outside convert.hg.revs (BC)
Before, when converting revisions without also including their already
converted parents in convert.hg.revs, the parents would no longer be parents.

That seems unfortunate and we dare to assume that nobody ever wants that.

Instead, preserve parents that are outside the current convert range but
already have been converted.

The parents returned in getcommit() are unconditionally converted, so we
introduce a separate optparents with optional parents.
2016-04-13 00:16:21 +02:00
timeless
109fcbc79e pycompat: switch to util.urlreq/util.urlerr for py3 compat 2016-04-06 23:22:12 +00:00
timeless
f77cdcd3b1 pycompat: switch to util.stringio for py3 compat 2016-04-10 20:55:37 +00:00
Julien Cristau
3e185dc2e6 convert: kill dead code
gitread is unused with the new commandline-based code.
2016-04-04 15:39:13 +02:00
Julien Cristau
dd250f1d5d convert: don't ignore errors from git diff-tree 2016-04-04 15:38:48 +02:00
Martin von Zweigbergk
a993a6ecbd convert: delete unused imports in git.py
As reported by pyflakes
2016-03-29 10:49:33 -07:00
Matt Mackall
4a96519522 merge with stable 2016-03-29 12:29:00 -05:00
Mateusz Kwapich
843a151296 convert: rewrite gitpipe to use common.commandline (SEC)
CVE-2016-3069 (4/5)
2016-03-22 17:05:11 -07:00
Mateusz Kwapich
e02db5f882 convert: dead code removal - old git calling functions (SEC)
CVE-2016-3069 (3/5)
2016-03-22 17:05:11 -07:00
Mateusz Kwapich
98120175fb convert: rewrite calls to Git to use the new shelling mechanism (SEC)
CVE-2016-3069 (2/5)

One test output changed because we were ignoring git return code in numcommits
before.
2016-03-22 17:05:11 -07:00
Mateusz Kwapich
8103d3a484 convert: add new, non-clowny interface for shelling out to git (SEC)
CVE-2016-3069 (1/5)

To avoid shell injection and for the sake of simplicity let's use the
common.commandline for calling git.
2016-03-22 17:05:11 -07:00
FUJIWARA Katsunori
1d2fb0ab99 hgext: use templatekeyword to mark a function as template keyword
This patch replaces registration of template keyword function in
bundled extensions by registrar.templatekeyword decorator all at once.
2016-03-13 05:17:06 +09:00
timeless
4f987db82a convert: bzr use absolute_import 2016-03-02 16:32:52 +00:00
Anton Shestakov
67f0dab5d3 convert: specify unit for ui.progress when scanning paths 2016-03-11 22:30:04 +08:00
Anton Shestakov
5b22fe48fb convert: specify unit for ui.progress when operating on files 2016-03-11 22:29:51 +08:00
FUJIWARA Katsunori
08147e9705 convert: fix "stdlib import follows local import" problem in transport
Before this patch, import-checker reports error below for importing
subversion python binding libraries.

    stdlib import "svn.*" follows local import: mercurial
2016-03-11 21:55:44 +09:00
FUJIWARA Katsunori
8afa478806 convert: fix relative import of stdlib module in subversion
Before this patch, import-checker reports "relative import of stdlib
module" error for importing Pool and SubversionException from svn.core
in subversion.py.

To fix this relative import of stdlib module, this patch adds prefix
'svn.core.' to Pool and SubversionException in source.

These 'svn.core.' relative accessing shouldn't cause performance
impact, because there are much more code paths accessing to
'svn.core.' relative properties.

BTW, in transport.py, this error is avoided by assignment below.

    SubversionException = svn.core.SubversionException

But this can't be used in subversion.py case, because:

  - such assignment in indented code block causes "don't use camelcase
    in identifiers" error of check-code.py

  - but it should be placed in indented block, because svn is None at
    failure of importing subversion python binding libraries (=
    examination of 'svn' is needed)
2016-03-11 21:55:44 +09:00
FUJIWARA Katsunori
b75e95ce01 convert: make subversion import transport locally
Introducing "absolute_import" feature in previous patch caused failure
of importing local module "transport".
2016-03-11 21:55:44 +09:00
timeless
0d9e787fe4 convert: __init__ use absolute_import 2016-03-02 16:34:43 +00:00
timeless
798be7b824 convert: cvs use absolute_import 2016-03-02 16:41:35 +00:00
timeless
2ac9193926 convert: transport use absolute_import 2016-03-02 16:37:50 +00:00
timeless
4bcfa639ac convert: bzr use absolute_import 2016-03-02 16:32:52 +00:00
timeless
d982b75f31 convert: common use absolute_import 2016-03-02 16:26:35 +00:00
timeless
d9aed31d36 convert: convcmd use absolute_import 2016-03-02 16:23:28 +00:00
timeless
d35bed319b convert: subversion use absolute_import 2016-03-02 16:13:05 +00:00
Bryan O'Sullivan
4a70f875ef with: use context manager for transaction in mercurial_sink 2016-01-15 13:14:47 -08:00
Martin von Zweigbergk
ee22e78a3b convert: use manifest.diff() instead of ctx.status()
mercurial_source.getchanges() seems to care about files whose nodeid
has changed even if their contents has not (i.e. it has been
reverted/backed out). The method uses ctx1.status(ctx2) to find
differencing files. However, that method is currently broken and
reports reverted changes as modified. In order to fix that method, we
first need to rewrite getchanges() using manifest.diff(), which does
report reverted files as modified (because it's about differences in
the manifest, so about nodeids).
2016-01-09 22:58:10 -08:00
Martin von Zweigbergk
933fd7efd0 convert: replace cache of (m,a,r) by (ma,r)
The next commit will rewrite the way we find changes between two
manifests. By making the cache not care about the difference between
added and modified files, we don't require the rewritten code to care
about that difference either. Also extract the call to ctx.status() to
simplify the next commit.
2016-01-10 21:07:34 -08:00
Martin von Zweigbergk
16f9affccf convert: use _ prefix for private methods in hg sink
This makes it clearer which methods are part of the interface defined
by the superclass.
2016-01-09 21:42:48 -08:00
Augie Fackler
a84cc516e7 merge: restate calculateupdates in terms of a matcher
Once we get a matcher down into manifestmerge, we can make narrowhg
work more easily and potentially let manifest.match().diff() do less
work in manifestmerge.
2015-12-14 20:37:41 -05:00
timeless
15faf26352 convert/svn: quiet check-config 2015-12-08 08:37:12 +00:00
Laurent Charignon
ffb03a1de7 convert: use repo._bookmarks.recordchange instead of repo._bookmarks.write
Before this patch, convert was using repo._bookmarks.write, a deprecated API
for saving bookmarks.
This patch changes the use of repo._bookmarks.write to
repo._bookmarks.recordchange.
2015-11-16 17:15:36 -08:00
Laurent Charignon
581136b55b convert: indentation change to make the next patch more legible
We put the code to be indented in the next patch in a "if True:" block to make
it easier to review.
2015-11-16 17:14:15 -08:00
timeless
6a754bf667 convert: monotone use absolute_import 2016-03-02 15:50:34 +00:00
timeless
d22428de9f convert: p4 use absolute_import 2016-03-02 15:31:15 +00:00
timeless
9b1f5043ce convert: hg use absolute_import 2016-03-02 15:26:49 +00:00
timeless
de2d02e8b7 convert: cvsps use absolute_import 2016-03-02 14:56:29 +00:00
timeless
96ac5cd883 convert: darcs use absolute_import 2016-03-02 14:23:23 +00:00
timeless
86b7c93478 convert: filemap use absolute_import 2016-03-02 09:00:58 +00:00
timeless
81c64e649e convert: gnuarch use absolute_import 2016-03-02 08:58:01 +00:00
timeless
65e468cce9 convert: git use absolute_import 2016-03-02 20:42:13 +00:00
Mads Kiilerich
84f1b9864b convert: fix Python syntax in 'splice in' message
Instead of reporting
  spliced in ['82544090e14fe18091e04f1fb0f0d7991cbe6e7e'] as parents of 369fd983d9e13330e9f12d9fce820deae84ea223
report
  spliced in 82544090e14fe18091e04f1fb0f0d7991cbe6e7e as parents of 369fd983d9e13330e9f12d9fce820deae84ea223
2015-10-19 16:49:54 +02:00
timeless@mozdev.org
1b8a3b4ea1 grammar: use does instead of do where appropriate 2015-10-14 02:06:54 -04:00
Emanuele Giaquinta
18869274ee cvsps: fix computation of parent revisions when log caching is on
cvsps computes the parent revisions of log entries by walking the cvs log
sorted by (rcs, revision) and by iteratively maintaining a 'versions'
dictionary which maps a (rcs, branch) pair onto the last revision seen for that
pair. When log caching is on and a log cache exists, cvsps fails to set the
parent revisions of new log entries because it does not iterate over the log
cache in the parents computation. A complication is that a file rcs can change
(move to/from the attic), with respect to its value in the log cache, if the
file is removed/added back. This patch adds an iteration over the log cache to
update the rcs of cached log entries, if changed, and to properly populate the
'versions' dictionary.
2015-10-07 11:33:52 +03:00
Pierre-Yves David
30913031d4 error: get Abort from 'error' instead of 'util'
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.

For great justice.
2015-10-08 12:55:45 -07:00
Yuya Nishihara
dac7167e8f help: include parens in DEPRECATED/EXPERIMENTAL keywords
In some languages that have no caps, "DEPRECATED" and "deprecated" can be
translated to the same byte sequence. So it is too wild to exclude messages
by _("DEPRECATED").
2015-09-26 11:38:39 +09:00
Durham Goode
a3e74a7ca5 convert: remove restriction on multiple --rev in hg source
Multiple --rev args on convert is a new feature, and was initially disabled for
all sources. It has since been enabled on git sources, and this patch enables it
on mercurial sources.
2015-09-03 10:29:42 -07:00
Durham Goode
aa9d278287 convert: fix syncing deletes from p2 merge commit
Recently we fixed converting merges to correctly sync changes from p2. We missed
the case of deletes though (so p2 deleted a file that p1 had not yet deleted,
and the file does not belong to the source).

The fix is to detect when p2 doesn't have the file, so we just sync it as a
delete to p1 in the merge.

Updated the test, and verified it failed before the fix.
2015-08-25 15:54:33 -07:00
Durham Goode
d57c6ca233 convert: add convert.git.skipsubmodules option
This adds an option to not pull in gitsubmodules during a convert. This is
useful when converting large git repositories where gitsubmodules were allowed
historically, but are no longer wanted.
2015-08-24 22:16:01 -07:00
Durham Goode
e41971b2a5 convert: fix convert dropping p2 contents during filemap merge
When converting a merge commit using a filemap convert (i.e. when moving
contents from the root of the repo into subdir1/), convert would silently drop
the entire contents of the target repo's p2. This was because when it built the
target commit, it did so by taking the target p1 and adding only the files that
changed in the source repo's merge commit.

This breaks in the case where the target repo has files that are unrelated to
the source repo (like in the case where you use convert to import a repo as a
subdirectory of another).

The fix is to use Mercurial's merge logic to detect which files in p2 we should
carry over to the merge. It follows three rules:

1) if the file belongs to the source, don't try to merge it. Rely on the list of
files provided to putcommit to be correct.

2) if the file requires merging or user input (change vs deleted), throw an
exception. We don't have enough info to do this.

3) if p2 has the newest, non-merge-requiring version of the file, take it

I've also added a test to cover this issue.
2015-08-14 15:22:47 -07:00
Durham Goode
447360b0c9 convert: implements targetfilebelongstosource for filemap source
This is an implementation of the new targetfilebelongstosource() function for
the filemapper. It simply checks if the given file name is prefixed by any of
the rename destinations.

It is not a perfect implementation since it doesn't account for the filemap
specifying includes or excludes, but that makes the problem much harder, and
this implementation should suffice for most cases.
2015-08-15 13:46:30 -07:00
Durham Goode
a17be08d31 convert: add function to test if file is from source
This adds a base implementation of a function that tests if a given file from a
target repo came from the source repo. This will be used later to detect which
files did not come from the source repo during a merge, so we can merge those
files correctly instead of dropping them.
2015-08-15 13:44:55 -07:00
Matt Mackall
6c04738a65 merge with stable 2015-08-10 15:30:28 -05:00
Durham Goode
8f62a68933 convert: fix git copy file content conversions
There was a bug in the git convert code where if you copied a file and modified
the copy source in the same commit, and if the copy dest was alphabetically
earlier than the copy source, the converted version would use the copy dest
contents for both the source and the target.

The root of the bug is that the git diff-tree output is formatted like so:

:<mode> <mode> <oldhash>    <newhash>    <state> <src>   <dest>
:100644 100644 c1ab79a15... 3dfc779ab... C069    oldname newname
:100644 100644 c1ab79a15... 03e2188a6... M       oldname

The old code would always take the 'oldname' field as the name of the file being
processed, then it would try to do an extra convert for the newname. This works
for renames because it does a delete for the oldname and a create for the
newname.

For copies though, it ends up associating the copied content (3dfc779ab above)
with the oldname. It only happened when the dest was alphabetically before
because that meant the copy got processed before the modification.

The fix is the treat copy lines as affecting only the newname, and not marking
the oldname as processed.
2015-08-06 17:21:46 -07:00
Eugene Baranov
322848b810 convert: when converting from Perforce use original local encoding by default
On Windows Perforce command line client uses default system locale to encode
output. Using 'latin_1' causes locale-specific characters to be replaced with
question marks. With this patch we will use default locale by default whilst
allowing to specify it explicity with 'convert.p4.encoding' config option.

This is a potentially breaking change for any scripts relying on output treated
as in 'latin_1' encoding.

Also because hgext.convert.convcmd overwrites detected default system locale
with UTF-8 we had to introduce an import cycle in hgext.convert.p4 to retrieve
originally detected encoding from hgext.convert.convcmd.
2015-07-22 16:57:11 +01:00
Eugene Baranov
29b28e8371 convert: when getting file from Perforce concatenate data at the end
As it turned out, even when getting relatively small files, concatenating
string data every time when new chunk is received is very inefficient.
Maintaining a string list of data chunks and concatenating everything in one go
at the end seems much more efficient - in my testing it made getting 40 MB file
7 times faster, whilst converting of a particularly big changelist with some big
files went down from 20 hours to 3 hours.
2015-07-30 00:58:05 +01:00
Matt Harbison
5dece5b905 convert: document convert.hg.startrev 2015-07-29 21:31:56 -04:00
Durham Goode
4126c0d435 convert: fix git convert using servers branches
The conversion from git to hg was reading the remote branch list directly from
the origin server. If the origin's branch had moved forward since the last git
fetch, it would return a git hash which didn't exist locally, and therefore the
branch was not converted.

This changes it to rely on the local repo's refs/remotes list of branches
instead, so it's completely cut off from the server.
2015-07-29 13:21:03 -07:00
Eugene Baranov
afafa68827 convert: use 'default' for specifying branch name in branchmap (issue4753)
A fix for issue2653 with f5abbf51a76e introduced a discrepancy how default
branch should be denoted when converting with branchmap from different SCM.
E.g. for Git and Mercurial you need to use 'default' whilst for Perforce and
SVN you had to use 'None'. This changeset unifies 'default' for such purposes
whilst falling back to 'None' when no 'default' mapping specified.
2015-07-14 14:40:56 +01:00