Commit Graph

59 Commits

Author SHA1 Message Date
Gregory Szorc
19ccacb90b convert: remove "replacecommitter" action
As pointed out by Yuya, this action doesn't add much (any?) value.
2017-01-14 10:11:19 -08:00
Gregory Szorc
2fc8eb0c18 convert: config option to control Git committer actions
When converting a Git repository to Mercurial at Mozilla, I encountered
a scenario where I didn't want `hg convert` to automatically add the
"committer: <committer>" line to commit messages. While I can hack around
this by rewriting the Git commit before it is fed into `hg convert`,
I figured it would be a useful knob to control.

This patch introduces a config option that allows lots of control
over the committer value. I initially implemented this as a single
boolean flag to control whether to save the committer message. But
then there was feedback that it would be useful to save the committer
in extra data. While this patch doesn't implement support for saving
in extra data, it does add a mechanism for extending which actions
to take on the committer field. We should be able to easily add
actions to save in extra data.

Some of the implemented features weren't asked for. But I figured they
could be useful. If nothing else they demonstrate the extensibility
of this mechanism.
2017-01-13 23:21:10 -08:00
Gregory Szorc
57ef32586b convert: add config option to control storing original revision
common.commit.__init__ sets saverev=True by default. The side effect
of this is that the hg sink will always set the "convert_revision"
extras key to the commit being converted.

This patch adds a config option to disable this behavior.

While most consumers will want "convert_revision" to be a) written
b) with the exact Git commit that was converted, some have use cases
that prefer otherwise. In my case, I am performing significant
rewrites of a Git repository *before* it is fed into `hg convert`.
I have to do this because `hg convert` does not easily support the kind
of transform I desire, even with extensions. (For the curious, I am
"linearizing" the history of a GitHub repo by removing merge commits
which add little value to the final history. It isn't easy to do this
during `hg convert` because of Mercurial's file copy/rename metadata
requirements.)

In my scenario, my pre-convert transform stores a "convert_revision"
key in the Git commit object containing the original Git commit ID.
I want this original Git commit ID carried forward to Mercurial. By
disabling the setting of this extra during `hg convert` and copying
the value from the Git commit object, I can have the final
"convert_revision" extra key contain the original Git commit ID. An
added test verifies this exact scenario.

This feature could likely be implemented for other VCS sources. But
until someone needs the feature, I'm inclined to hold off implementing.
2016-12-22 23:28:35 -07:00
Gregory Szorc
6c0707a02c convert: add config option to copy extra keys from Git commits
Git commit objects support storing arbitrary key-value metadata. While
there is no user-facing mechanism in Git to record these values, some
tools do record data here.

Currently, `hg convert` only handles the "author," "committer," and
"parent" keys in Git commit objects. All other keys are ignored. This
means that any custom keys are lost when converting Git repos to
Mercurial.

This patch implements support for copying a whitelist of extra keys
from Git commit objects to the "extras" dict of the destination. As
the added tests demonstate, this allows extra metadata to be preserved
during the conversion process.

This patch stops short of converting all metadata to "extras." We could
potentially implement this via `convert.git.extrakeys=*` or similar.
But copying everything by default is a bit dangerous because if Git
adds new keys to commit objects, we could find ourselves copying
things that shouldn't be copied!

This patch also assumes the source key is the same as the destination
key. We could implement support for prefixing the output key to
distinguish it as coming from Git. But until this feature is needed,
I'm inclined to hold off implementing it.
2016-12-22 23:28:11 -07:00
Gregory Szorc
8f73471951 convert: config option for git rename limit
By default, Git applies rename and copy detection to 400 files. The
diff.renamelimit config option and -l argument to diff commands can
override this.

As part of converting some repositories in the wild, I was hitting
the default limit. Unfortunately, the warnings that Git prints in this
scenario are swallowed because the process running functionality in
common.py redirects stderr to /dev/null by default. This seems like
a bug, but a bug for another day.

This commit establishes a config option to send the rename limit
through to `git diff-tree`. The added tests demonstrate a too-low
rename limit doesn't result in copy metadata being recorded.
2016-12-18 12:53:20 -08:00
Matt Harbison
5261ae4879 tests: add globs for Windows 2016-05-05 21:14:12 -04:00
Blake Burkhart
a5bebc504a convert: pass absolute paths to git (SEC)
Fixes CVE-2016-3105 (1/1).

Previously, it was possible for the repository path passed to git-ls-remote
to be misinterpreted as a URL.

Always passing an absolute path to git is a simple way to avoid this.
2016-04-06 22:57:46 -05:00
timeless
45b895c57d minirst: change hgrole to use single quotes
We decided to reserve double quotes for arguments to hg because cmd
does not like single quotes, so switch the outer quotes to single
2016-01-12 06:03:36 +00:00
Durham Goode
d57c6ca233 convert: add convert.git.skipsubmodules option
This adds an option to not pull in gitsubmodules during a convert. This is
useful when converting large git repositories where gitsubmodules were allowed
historically, but are no longer wanted.
2015-08-24 22:16:01 -07:00
Eugene Baranov
322848b810 convert: when converting from Perforce use original local encoding by default
On Windows Perforce command line client uses default system locale to encode
output. Using 'latin_1' causes locale-specific characters to be replaced with
question marks. With this patch we will use default locale by default whilst
allowing to specify it explicity with 'convert.p4.encoding' config option.

This is a potentially breaking change for any scripts relying on output treated
as in 'latin_1' encoding.

Also because hgext.convert.convcmd overwrites detected default system locale
with UTF-8 we had to introduce an import cycle in hgext.convert.p4 to retrieve
originally detected encoding from hgext.convert.convcmd.
2015-07-22 16:57:11 +01:00
Matt Harbison
5dece5b905 convert: document convert.hg.startrev 2015-07-29 21:31:56 -04:00
Durham Goode
ecce172bba convert: allow customizing git remote prefix
Previously all git remotes were created as "remote/foo". This patch adds a
configuration option for deciding what the prefix should be. This is useful if
you want the bookmarks to be "origin/foo" like they are in git, or if you're
integrating with the remotenames extension and don't want the local remote/foo
bookmarks to overlap with the remote foo bookmarks.
2015-07-13 21:37:46 -07:00
Durham Goode
9f87b2f455 convert: add config for recording the source name
This creates the convert.hg.sourcename config option which will embed a user
defined name into each commit created by the convert. This is useful when using
the convert extension to merge several repositories together and we want to
record where each commit came from.
2015-07-08 10:31:09 -07:00
Durham Goode
0f795691b8 convert: add support for specifying multiple revs
Previously convert could only take one '--rev'. This change allows the user to
specify multiple --rev entries. For instance, this could allow converting
multiple branches (but not all branches) at once from git.

In this first patch, we disable support for this for all sources.  Future
patches will enable it for select sources (like git).
2015-07-08 10:27:43 -07:00
Durham Goode
d70241d991 convert: add config to not convert tags
In some cases we do not want to convert tags from the source repo to be tags in
the target repo (for instance, in a large repository, hgtags cause scaling
issues so we want to avoid them). This adds a config option to disable
converting tags.
2015-06-29 13:40:20 -07:00
Matt Harbison
ff48d6b214 convert: support incremental conversion with hg subrepos
This was implied in issue3486, which specifically asked for subrepo support in
lfconvert.  Now that lfconvert uses the convert extension internally when going
to normal files, the issue is half fixed.  But now even non largefile repos
benefit when other transformations are needed.

Supporting a full subrepo tree conversion from a single command doesn't seem
reasonable, given the number of options that can be provided, and the
transformations that would need to occur when entering a subrepo (consider
'filemap' paths).  Instead, this allows the user to incrementally convert each
hg subrepo from bottom up like so:

  # so convert knows the dest type when it sees a non empty dest dir
  $ hg init converted

  $ hg convert orig/sub1 converted/sub1
  $ hg convert orig/sub2 converted/sub2
  $ hg convert orig converted

This allows different options to be applied to different subrepos more readily.
It assumes the shamap is in the default location in each converted subrepo for
simplicity.  It also allows for a subrepo to be cloned into place, in case _it_
doesn't need a conversion.  I was able to convert away from using
largefiles/bfiles in several subrepos with this mechanism.
2015-05-29 13:25:34 -04:00
Siddharth Agarwal
f149aa54b6 convert: change default for git rename detection to 50%
This default mirrors the default for 'git diff'. Other commands have slightly
different defaults -- for example, the move/copy detection for 'git blame'
assumes that a hunk is moved if more than 40 alphanumeric characters are the
same, or copied if more than 20 alphanumeric characters are the same. 50% seems
to be the most common default, though.
2014-09-23 14:45:23 -07:00
Siddharth Agarwal
c1d6e28494 convert: add support to find git copies from all files in the working copy
I couldn't think of a better name for this option, so I stole the Git one in
the hope that anyone converting a Git repo knows what it means.
2014-09-12 12:28:30 -07:00
Siddharth Agarwal
476da59ea2 convert: add support to detect git renames and copies
Git is fairly unique among VCSes in that it doesn't record copies and renames,
instead choosing to detect them on the fly. Since Mercurial expects copies and
renames to be recorded, it can be valuable to preserve this history while
converting a Git repository to Mercurial. This patch adds a new convert option,
called 'convert.git.similarity', which determines how similar files must be to
be treated as renames or copies.
2014-09-12 11:23:26 -07:00
Siddharth Agarwal
1a1fc7e9e2 convert: add initial docs for git sources
Upcoming patches will add config options for git sources. This patch adds a
place to document them.
2014-09-12 10:17:56 -07:00
Mads Kiilerich
0df22182cc convert: introduce --full for converting all files
Convert will normally only process files that were changed in a source
revision, apply the filemap, and record it has a change in the target
repository. (If it ends up not really changing anything, nothing changes.)

That means that _if_ the filemap is changed before continuing an incremental
convert, the change will only kick in when the files it affects are modified in
a source revision and thus processed.

With --full, convert will make a full conversion every time and process
all files in the source repo and remove target repo files that shouldn't be
there. Filemap changes will thus kick in on the first converted revision, no
matter what is changed.

This flag should in most cases not make any difference but will make convert
significantly slower.

Other names has been considered for this feature, such as "resync", "sync",
"checkunmodified", "all" or "allfiles", but I found that they were less obvious
and required more explanation than "full" and were harder to describe
consistently.
2014-08-26 22:03:32 +02:00
Matt Mackall
540b8eb745 help: tweak --verbose command help hint
We used to have two slightly different message which people wouldn't read...
and then complain that they couldn't find the global options or examples.

So we unify them into one message that's upfront that STUFF IS
INTENTIONALLY HIDDEN and that looks more like our normal hint style.
2014-08-12 03:01:37 -05:00
Mads Kiilerich
f92d923209 convert: backout 41e062383fc9 and 80f42131aca3 -closemap
Closemap solves a very specific use case. It would be better to have a more
generic solution than to have to maintain this forever.

Closemap has not been released yet and removing it now will not break any
backward compatibility contract.

There is no test coverage for closemap but it seems like the same can be
achieved with a simple and much more powerful custom extension:

import hgext.convert.hg
class source(hgext.convert.hg.mercurial_source):
    def getcommit(self, rev):
        c = super(source, self).getcommit(rev)
        if rev in ['''
d643f67092ff123f6a192d52f12e7d123dae229f
3a6a38229d418ba09cb7784c01453a93b4d363f8
facceca31c18f7ef800977055dbcbd7fcb5c5cb2
''']:
            c.extra = c.extra.copy()
            c.extra['close'] = '1'
        return c
hgext.convert.hg.mercurial_source = source
2014-04-16 01:10:08 +02:00
Mads Kiilerich
46b435b7d3 convert: backout 8a62813ea220 and ca6679798c95 - tagmap
Tagmap solves a very specific use case. It would be better to have a more
generic solution than to have to maintain this forever.

Tagmap has not been released yet and removing it now will not break any
backward compatibility contract.

There is no test coverage for tagmap but it seems like the same can be achieved
with a (relatively) simple and much more powerful custom extension:

import hgext.convert.hg
def f(tag):
    return tag.replace('some', 'other')
class source(hgext.convert.hg.mercurial_source):
    def gettags(self):
        return dict((f(tag), node)
                    for tag, node in in super(source, self).gettags().items())
    def getfile(self, name, rev):
        data, flags = super(source, self).getfile(name, rev)
        if name == '.hgtags':
            data = ''.join(l[:41] + f(l[41:]) + '\n' for l in data.splitlines())
        return data, flags
hgext.convert.hg.mercurial_source = source
2014-04-16 01:09:49 +02:00
Mads Kiilerich
0e8795ccd6 spelling: fixes from spell checker 2014-04-13 19:01:00 +02:00
Matt Mackall
9ca3ee752a merge with stable 2014-03-19 16:21:53 -05:00
Mads Kiilerich
f6b400481f convert: more clear documentation of the 'include' default of a 'include .'
At first glance it can be confusing that adding a superfluous include directive
will exclude more files.
2014-03-19 00:19:54 +01:00
Sean Farley
6aefcf4449 convert: add tagmap option
Tests have been updated.
2014-01-22 15:43:21 -06:00
Sean Farley
e7a8a8092c convert: add closemap option
Tests have been updated.
2014-01-21 11:35:17 -06:00
Matt Mackall
280eec0087 tests: skip tests that require not having root (issue4089)
This adds a new root hghave to test against. Almost all of these are a
subset of unix-permissions, but that is also used for checking exec
bit handling.
2013-11-14 18:07:43 -06:00
Mads Kiilerich
1692899d8d convert: introduce hg.revs to replace hg.startrev and --rev with a revset
The existing knobs for controlling which revisions to convert were often
insufficient. Revsets is a shiny hammer that provides a better solution.

Revsets has been introduced in --rev handling in a lot of other places while
being more or less backwards compatible. Doing the same here would be a much
more elegant ... but that would unfortunately not work in this case.  "--rev 7"
used to mean revision 0 to 7 - it would be an unacceptable change if it
suddenly just meant revision 7.

Instead we introduce a new configuration setting. It will only work for
Mercurial repositories so adding a new commandline option for it would not be a
nice solution.

There is no way to use the fancy deprecation markup for configuration settings
so we just remove the documentation of hg.startrev.
2013-07-20 00:43:08 +02:00
Mads Kiilerich
29b14aad65 convert: fix description of 'convert --rev' 2013-07-19 02:32:36 +02:00
Constantine Linnick
4d22c22a01 convert: add closesort algorithm to mercurial sources
If you actively work with branches, sometimes you need to close old branches
which last commited hundreds revisions ago. After close you will see long
lines in graph visually spoiling history. This sort only moves closed
revisions as close as possible to parents and does not increase storage size
as datesort do.
2013-03-24 00:06:52 +07:00
Kevin Bullock
bd7b56e105 merge with stable 2013-01-14 10:17:06 -06:00
FUJIWARA Katsunori
9fd6562bca convert: correct 'hooks' section name in online help
The section name for hooks is not 'hook', but 'hooks'.
2013-01-14 23:14:45 +09:00
Mads Kiilerich
5584e7ca16 dispatch: show empty filename in OSError aborts
Mercurial would sometimes exit with:
  abort: No such file or directory
where str of the actual OSError exception was the more helpful:
  [Errno 2] No such file or directory: ''

The exception will now always show the filename and quote it:
  abort: No such file or directory: ''
2013-01-07 02:00:29 +01:00
Julian Cowley
15e470ce7f convert: add config option to use the local time zone
The default for the time zone offset in a converted changeset has
always been 0 (UTC).  With this patch, the converted changeset is
modified so that the local offset from UTC is specified as the time
zone offset.

The option is specified as the boolean convert.localtimezone (default
False).  Example usage:

    hg convert -s cvs --config convert.localtimezone=True example-cvs example-hg

IMPORTANT: the patch only applies to conversions from cvs or svn.
The documentation for the option only appears in those two sections
in the convert help text.
2012-11-18 12:26:50 -10:00
FUJIWARA Katsunori
477c2590ac help: indicate help omitting if help document is not fully displayed
Before this patch, there is no information about whether help document
is fully displayed or not.

So, some users seem to misunderstand "-v" for "hg help" just as "the
option to show list of global options": experience on "hg help -v" for
some commands not containing verbose containers may strengthen this
misunderstanding.

Such users have less opportunity for noticing omitted help document,
and this may cause insufficient understanding about Mercurial.

This patch indicates help omitting, if help document is not fully
displayed.

For command help, the message below is displayed at the end of help
output, if help document is not fully displayed:

    use "hg -v help xxxx" to show more complete help and the global
    options

and otherwise:

    use "hg -v help xxxx" to show the global options

For topics and extensions help, the message below is displayed, only
if help document is not fully displayed:

    use "hg help -v xxxx" to show more complete help

This allows users to know whether there is any omitted information or
not exactly, and can trigger "hg help -v" invocation.

This patch causes formatting help document twice, to switch messages
one for omitted help, and another for not omitted. This decreases
performance of help document formatting, but it is not mainly focused
at help command invocation, so this wouldn't become problem.
2012-10-18 10:31:15 +09:00
Mads Kiilerich
2f4504e446 fix trivial spelling errors 2012-08-15 22:38:42 +02:00
Mads Kiilerich
9cddfd19ab check-code: fix check for trailing whitespace on sh command lines
The $ has been without necessary escaping since introduced in c4ecbbd282fe.
2012-08-08 18:10:16 +02:00
FUJIWARA Katsunori
0cf97588a4 doc: unify section level between help topics
Some help topics use "-" for the top level underlining section mark,
but "-" is used also for the top level categorization in generated
documents: "hg.1.html", for example.

So, TOC in such documents contain "sections in each topics", too.

This patch changes underlining section mark in some help topics to
unify section level in generated documents.

After this patching, levels of each section marks are:

  level0
  """"""
    level1
    ======
      level2
      ------
        level3
        ......
          level4
          ######

And use of section markers in each documents are:

  - mercurial/help/*.txt can use level1 or more
    (now these use level1 and level2)

  - help for core commands can use level2 or more
    (now these use no section marker)

  - descriptions of extensions can use level2 or more
    (now hgext/acl uses level2)

  - help for commands defined in extension can use level4 or more
    (now "convert" of hgext/convert uses level4)

"Level0" is used as top level categorization only in "doc/hg.1.txt"
and the intermediate file generated by "doc/gendoc.py", so end users
don't see it in "hg help" outoput and so on.
2012-07-25 16:40:38 +09:00
Mads Kiilerich
377db36818 help: fix some instances of 'the the' 2012-07-26 02:54:13 +02:00
Matt Harbison
5c29c87aee revset: add a predicate for finding converted changesets
This selects changesets added because of repo conversions.  For example

    hg log -r "converted()"      # all csets created by a convertion
    hg log -r "converted(rev)"   # the cset converted from rev in the src repo

The converted(rev) form is analogous to remote(id), where the remote repo is
the source of the conversion.  This can be useful for cross referencing an old
repository into the current one.

The source revision may be the short changeset hash or the full hash from the
source repository.  The local identifier isn't useful.  An interesting
ramification of this is if a short revision is specified, it may cause more
than one changeset to be selected.  (e.g. converted(6) matches changesets with
a convert_revision field of 6e..e and 67..0)

The convert.hg.saverev option must have been specified when converting the hg
source repository for this to work.  The other sources automatically embed the
converted marker.
2012-05-13 01:12:26 -04:00
Mads Kiilerich
f7215835a1 tests: convert some hghave unix-permissions to #if 2012-06-19 01:43:41 +02:00
Matt Mackall
31ad230c52 pull: backout change to return code
This bit a number of people.
2012-02-10 16:09:30 -06:00
Matt Mackall
902bee47a8 pull: return 1 when no changes found (BC)
Currently we have the following return codes if nothing is found:

                commit   incoming    outgoing      pull     push
intended           1        1           1            1       1
documented         1        1           1            0       1
actual             1        1           1            0       1

This makes pull agree with the rest of the table and makes it easy to
detect "nothing was pulled" in scripts.
2012-01-30 16:01:54 -06:00
Olav Reinert
cfc7a5074e minirst: simplify and standardize field list formatting
The default width of field lists is changed from 12 to 14 to align minirst with
the rst2html tool. Shrinking the width of the left column to fit the content is
removed, to keep formatting simple and uniform.
2012-01-11 18:08:25 +01:00
Mads Kiilerich
72ebeacea3 tests: use 'hghave unix-permissions' for tests that really use chmod
chmod of helper scripts is not included.

tests that exercise the x bit in the file system uses 'hghave execbit'.
2011-11-07 03:14:55 +01:00
Eli Carter
bad1c40c51 convert: fix typo 2011-10-18 10:32:12 -05:00
Matt Mackall
58c81fa35e help: unify the two -v notes for command help 2011-10-07 16:36:54 -05:00