Commit Graph

1025 Commits

Author SHA1 Message Date
Augie Fackler
e2774d9258 python3: wrap all uses of <exception>.strerror with strtolocal
Our string literals are bytes, and we mostly want to %-format a
strerror into a one of those literals, so this fixes a ton of issues.
2017-08-22 20:03:07 -04:00
Boris Feld
316d61deba bookmark: use 'applychanges' in the convert extension 2017-07-10 17:30:20 +02:00
Boris Feld
1d64a7418f convert: use the new 'phase.registernew' function 2017-07-11 00:59:23 +02:00
FUJIWARA Katsunori
cc2b59230c convert: transcode CVS log messages by specified encoding (issue5597)
Converting from CVS to Mercurial assumes that CVS log messages in "cvs
rlog" output are encoded in UTF-8 (or basic Latin-1). But cvs itself
is usually unaware of encoding of log messages, in practice.

Therefore, if there are commits, of which log message is encoded in
other than UTF-8, log message of corresponded revisions in the
converted repository will be broken.

To avoid such broken log messages, this patch transcodes CVS log
messages by encoding specified via "convert.cvsps.logencoding"
configuration.

This patch accepts multiple encoding for convenience, because
"multiple encoding mixed in a repository" easily occurs. For example,
UTF-8 (recent POSIX), cp932 (Windows), and EUC-JP (legacy POSIX) are
well known encoding for Japanese.
2017-07-11 02:10:04 +09:00
Augie Fackler
48687498f9 convert: remove if False block
This code has never run since its introduction on July 18th,
2007. It's time for it to go.

Differential Revision: https://phab.mercurial-scm.org/D19
2017-07-07 15:08:23 -04:00
Matt Harbison
5eb7c3a833 convert: correct the documentation about whitespace in branchmap branches
Might as well let the users know they can get rid of branch names with spaces.
2017-06-10 02:20:14 -04:00
Yuya Nishihara
3e663dde68 registrar: move cmdutil.command to registrar module (API)
cmdutil.command wasn't a member of the registrar framework only for a
historical reason. Let's make that happen. This patch keeps cmdutil.command
as an alias for extension compatibility.
2016-01-09 23:07:20 +09:00
Martin von Zweigbergk
c3406ac3db cleanup: use set literals
We no longer support Python 2.6, so we can now use set literals.
2017-02-10 16:56:29 -08:00
David Soria Parra
92f2b90f4d convert: fix the handling of empty changlist descriptions in P4
Empty changelist descriptions are valid in Perforce. If we encounter one of
them we are currently running into an IndexError. In case of empty commit
messages set the commit message to **empty changelist description**, which
follows Perforce terminology.
2017-03-22 14:12:58 -05:00
Yuya Nishihara
dcade16cf7 encoding: factor out unicode variants of from/tolocal()
Unfortunately, these functions will be commonly used on Python 3.
2017-03-13 09:11:08 -07:00
Pierre-Yves David
76c3fb0c64 convert: don't use mutable default argument value
Caught by pylint.
2017-03-14 23:48:08 -07:00
Pierre-Yves David
db602f5625 convert: directly use repo.vfs.join
The 'repo.join' method is about to be deprecated.
2017-03-08 16:51:25 -08:00
Pierre-Yves David
f3a174150b vfs: use 'vfs' module directly in 'hgext.convert'
Now that the 'vfs' classes moved in their own module, lets use the new module
directly. We update code iteratively to help with possible bisect needs in the
future.
2017-03-02 13:32:14 +01:00
Pierre-Yves David
e5cb48ac36 vfs: replace 'scmutil.opener' usage with 'scmutil.vfs'
The 'vfs' class is the first class citizen for years. We remove all usages of
the older API. This will let us remove the old API eventually.
2017-03-02 03:52:36 +01:00
Pulkit Goyal
56031921a5 py3: convert the mode argument of os.fdopen to unicodes (2 of 2) 2017-02-13 22:15:28 +05:30
Gregory Szorc
19ccacb90b convert: remove "replacecommitter" action
As pointed out by Yuya, this action doesn't add much (any?) value.
2017-01-14 10:11:19 -08:00
Gregory Szorc
2fc8eb0c18 convert: config option to control Git committer actions
When converting a Git repository to Mercurial at Mozilla, I encountered
a scenario where I didn't want `hg convert` to automatically add the
"committer: <committer>" line to commit messages. While I can hack around
this by rewriting the Git commit before it is fed into `hg convert`,
I figured it would be a useful knob to control.

This patch introduces a config option that allows lots of control
over the committer value. I initially implemented this as a single
boolean flag to control whether to save the committer message. But
then there was feedback that it would be useful to save the committer
in extra data. While this patch doesn't implement support for saving
in extra data, it does add a mechanism for extending which actions
to take on the committer field. We should be able to easily add
actions to save in extra data.

Some of the implemented features weren't asked for. But I figured they
could be useful. If nothing else they demonstrate the extensibility
of this mechanism.
2017-01-13 23:21:10 -08:00
Gregory Szorc
57ef32586b convert: add config option to control storing original revision
common.commit.__init__ sets saverev=True by default. The side effect
of this is that the hg sink will always set the "convert_revision"
extras key to the commit being converted.

This patch adds a config option to disable this behavior.

While most consumers will want "convert_revision" to be a) written
b) with the exact Git commit that was converted, some have use cases
that prefer otherwise. In my case, I am performing significant
rewrites of a Git repository *before* it is fed into `hg convert`.
I have to do this because `hg convert` does not easily support the kind
of transform I desire, even with extensions. (For the curious, I am
"linearizing" the history of a GitHub repo by removing merge commits
which add little value to the final history. It isn't easy to do this
during `hg convert` because of Mercurial's file copy/rename metadata
requirements.)

In my scenario, my pre-convert transform stores a "convert_revision"
key in the Git commit object containing the original Git commit ID.
I want this original Git commit ID carried forward to Mercurial. By
disabling the setting of this extra during `hg convert` and copying
the value from the Git commit object, I can have the final
"convert_revision" extra key contain the original Git commit ID. An
added test verifies this exact scenario.

This feature could likely be implemented for other VCS sources. But
until someone needs the feature, I'm inclined to hold off implementing.
2016-12-22 23:28:35 -07:00
Gregory Szorc
6c0707a02c convert: add config option to copy extra keys from Git commits
Git commit objects support storing arbitrary key-value metadata. While
there is no user-facing mechanism in Git to record these values, some
tools do record data here.

Currently, `hg convert` only handles the "author," "committer," and
"parent" keys in Git commit objects. All other keys are ignored. This
means that any custom keys are lost when converting Git repos to
Mercurial.

This patch implements support for copying a whitelist of extra keys
from Git commit objects to the "extras" dict of the destination. As
the added tests demonstate, this allows extra metadata to be preserved
during the conversion process.

This patch stops short of converting all metadata to "extras." We could
potentially implement this via `convert.git.extrakeys=*` or similar.
But copying everything by default is a bit dangerous because if Git
adds new keys to commit objects, we could find ourselves copying
things that shouldn't be copied!

This patch also assumes the source key is the same as the destination
key. We could implement support for prefixing the output key to
distinguish it as coming from Git. But until this feature is needed,
I'm inclined to hold off implementing it.
2016-12-22 23:28:11 -07:00
Gregory Szorc
b79026697e convert: don't use {} as default argument value
This is a common Python gotcha. I'm kinda surprised we don't have a
check-code to detect this :/
2016-12-22 09:26:47 -08:00
Gregory Szorc
8f73471951 convert: config option for git rename limit
By default, Git applies rename and copy detection to 400 files. The
diff.renamelimit config option and -l argument to diff commands can
override this.

As part of converting some repositories in the wild, I was hitting
the default limit. Unfortunately, the warnings that Git prints in this
scenario are swallowed because the process running functionality in
common.py redirects stderr to /dev/null by default. This seems like
a bug, but a bug for another day.

This commit establishes a config option to send the rename limit
through to `git diff-tree`. The added tests demonstrate a too-low
rename limit doesn't result in copy metadata being recorded.
2016-12-18 12:53:20 -08:00
Pulkit Goyal
1f6538b90b py3: replace os.name with pycompat.osname (part 2 of 2) 2016-12-19 00:28:12 +05:30
Pulkit Goyal
9251b69fc4 py3: replace os.environ with encoding.environ (part 5 of 5) 2016-12-18 02:08:59 +05:30
David Soria Parra
34b35d6c71 convert: parse perforce data on-demand
We are using read-only attributes that parse the perforce data on
demand. We are reading the data only once whenever an attribute is
requested and use it throughout the import process. This is equivalent
to the previous behavior, but we are avoiding reading from perforce when
we initialize the object, but instead run it during the actual import
process, when the first attribute is requested (usually getheads(), see
`convertcmd.convert`).
2016-12-20 09:23:50 -08:00
David Soria Parra
74a69748a6 convert: return calculated values from parse() instead of manpulating state 2016-12-20 09:23:50 -08:00
David Soria Parra
aede854a35 convert: move localname state to function scope 2016-12-20 09:23:50 -08:00
David Soria Parra
41ea64b87e convert: use return value in parse_view() instead of manipulating state 2016-12-20 09:23:50 -08:00
Yuya Nishihara
cf9b8abe16 convert: remove unused-but-set variable introduced in d68d4752d365
Spotted by pyflakes.
2016-12-18 16:20:04 +09:00
Pulkit Goyal
908a677054 py3: replace os.sep with pycompat.ossep (part 4 of 4) 2016-12-17 20:24:46 +05:30
Yuya Nishihara
601f618d04 convert: inline strutil.rfindall()
This is the only place where strutil is used. I don't think it's worth to
keep the strutil module, so inline it.

Also, strutil.rfindall() appears to have off-by-one error. 'end = c - 1' is
wrong because 'end' is exclusive.
2016-10-16 16:58:43 +09:00
David Soria Parra
dab6f08428 convert: return commit objects for revisions in the revmap
Source revision data that exists in the revmap are ignored when pulling
data from Perforce as we consider them already imported. In case where
the `convertcmd.convert` algorithm requests a commit object for such
a revision we are creating it.  This is usually the case for parent of
the first imported revision.
2016-12-14 12:07:23 -08:00
David Soria Parra
52774a1f49 convert: encapsulate commit data fetching and commit object creation
Split fetching the `describe` form from Perforce and the commit object creation
into two functions. This allows us to reuse the commit construction for
revisions passed from a revmap.
2016-12-13 21:49:58 -08:00
David Soria Parra
8040456970 convert: do not provide head revisions if we have no changests to import
Don't set a head revision in cases where we have a revmap but no
changesets to import, as convertcmd.convert() treats them as heads of
to-imported revisions.
2016-12-13 21:49:58 -08:00
David Soria Parra
4683ce4a8a convert: allow passing in a revmap
Implement `common.setrevmap` which is used to pass in a file with existing
revision mappings. This functionality is used by `convertcmd.convert` if it
exists and allows implementors such as the p4 converter to make use of an
existing mapping.

We are using the revmap to abort scanning and the repository for more information
if we already have the revision. This means we are allowing incremental imports
in cases where a revmap is provided.
2016-12-14 01:45:57 -08:00
David Soria Parra
cde8421946 convert: use convert_revision for P4 imports
We are using convert_revisions in other importers. In order to unify this
we are also using convert_revision for Perforce in addition to the original
'p4'.
2016-12-13 21:49:58 -08:00
David Soria Parra
eeb4ddccf3 convert: remove unused dictionaries
self.parent, self.lastbranch and self.tags have never been used.
2016-12-14 01:45:17 -08:00
David Soria Parra
5b7b900ca3 convert: self.heads is a list
self.heads is used as a list throughout convert and never a dictionary.
Initialize it correctly to a list.
2016-12-14 01:43:47 -08:00
David Soria Parra
338550df2b convert: don't use long list comprehensions
We are iterating over p4changes. Make the continue condition more clear
and easier to add new conditions in future patches, by removing the list
comprehension and move the condition into the existing for-loop.
2016-12-13 21:49:58 -08:00
Pulkit Goyal
97f340e354 py3: use pycompat.getcwd() instead of os.getcwd()
We have pycompat.getcwd() which returns bytes path on Python 3. This patch
changes most of the occurences of the os.getcwd() with pycompat one.
2016-11-23 00:03:11 +05:30
Jun Wu
216e95e106 convert: migrate to util.iterfile 2016-11-14 23:17:15 +00:00
Durham Goode
52b8095f37 manifest: remove last uses of repo.manifest
Now that all the functionality has been moved to manifestlog/manifestrevlog/etc,
we can finally change all the uses of repo.manifest to use the new versions. A
future diff will then delete repo.manifest.

One additional change in this commit is to change repo.manifestlog to be a
@storecache property instead of @property. This is required by some uses of
repo.manifest require that it be settable (contrib/perf.py and the static http
server). We can't do this in a prior change because we can't use @storecache on
this until repo.manifest is no longer used anywhere.
2016-11-10 02:13:19 -08:00
Yuya Nishihara
3ea0560e2b convert: have debugsvnlog obtain standard streams from ui
This will help porting to Python 3, where sys.stdin/out/err are unfortunately
unicode streams so we can't use them directly.
2015-10-03 14:34:56 +09:00
Yuya Nishihara
3f27064dab convert: remove superfluous setbinary() calls from debugsvnlog
0f20f68c768c made standard streams set to binary mode globally.
2015-10-03 14:29:13 +09:00
Mateusz Kwapich
f97c61b781 py3: use raw strings in line continuation (convert ext)
Our py2 to py3 string translations marks those as bytestrings.
2016-10-10 05:31:31 -07:00
Augie Fackler
4e1c384d0a extensions: change magic "shipped with hg" string
I've caught multiple extensions in the wild lying about being
'internal', so it's time to move the goalposts on people. Goalpost
moving will continue until third party extensions stop trying to
defeat the system.
2016-08-23 11:26:08 -04:00
Durham Goode
769d2595c4 convert: move svn config initializer out of the module level
The svn_config_get_config config call was being called at the module level, but
had the potential to throw permission denied errors if ~/.subversion/servers was
not readable. This could happen in certain test environments where the user
permissions were very particular.

This prevented the remotenames extension from loading, since it imports
convert's hg module, which imports convert's subversion module, which calls
this. The config is only ever used from this one constructor, so let's just move
it in to there.
2016-08-01 17:38:01 -07:00
Kevin Bullock
99e7f64512 convert: update use of deprecated bzrlib property
The inventory property was deprecated in favor of root_inventory in bzr
2.5.0. Current version is 2.7.0.

I noticed this when testing locally on Python 2.6.9, which has warnings
turned on by default. The failure that occurs without this patch can be
seen on Python 2.7 by running with warnings enabled:

     $ PYTHONWARNINGS=::DeprecationWarning make 'test-convert-bzr*'
2016-07-19 11:00:32 -05:00
Pulkit Goyal
5f52c722cf py3: conditionalize cPickle import by adding in util
The cPickle is renamed to _pickle in python3 and this C extension is available
 in pickle which was not included in earlier versions. So imports are conditionalized
 to import cPickle in py2 and pickle in py3. Moreover the use of pickle in py2 is
 switched to cPickle as the C extension is faster. The hack is added in util.py and
the modules import util.pickle
2016-06-04 14:38:00 +05:30
Yuya Nishihara
a5c934df3c py3: move up symbol imports to enforce import-checker rules
Since (b) is banned, we should do the same for (a) for consistency.

 a) from mercurial import hg
    from mercurial.i18n import _

 b) from . import hg
    from .i18n import _
2016-05-14 14:03:12 +09:00
Blake Burkhart
a5bebc504a convert: pass absolute paths to git (SEC)
Fixes CVE-2016-3105 (1/1).

Previously, it was possible for the repository path passed to git-ls-remote
to be misinterpreted as a URL.

Always passing an absolute path to git is a simple way to avoid this.
2016-04-06 22:57:46 -05:00