Source and destination constructors should be fast so configuration issues are
hit quickly, including authentication and filemap/authormap/splicemap issues.
Delaying might be a problem if the remote side disconnects idle connections
while the log is being read. It did not happen when converting the openafs
repository, where log retrieval took at least 10 minutes.
They are unnecessary. I did leave them in localrepo.py where there is
something like:
_junk = foo()
_junk = None
to free memory early. I don't know if just `foo()` will free the return
value as early.
In the cmdtable for the convert extension, the default value for splicefile is
empty string, while mapfile (the class that reads splicemaps) expects either a
real path or None. This patch changes mapfile to expect a real path or a
logically false value (False, None, empty string, etc.).
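The relaxed check can be sketched as follows. This is a minimal illustration,
not the extension's actual code: the class name MapFile, its dict-based
storage and the whitespace-separated "key value" line format are assumptions
made for the example.

```python
class MapFile(dict):
    """Hypothetical, simplified stand-in for the convert extension's mapfile."""

    def __init__(self, path):
        super().__init__()
        self.path = path
        # Any logically false path (None, "", False) means "no map to read".
        if path:
            self._read()

    def _read(self):
        # Assumed format: one "key value" pair per line, blank lines ignored.
        with open(self.path) as fp:
            for line in fp:
                line = line.strip()
                if line:
                    key, value = line.split(None, 1)
                    self[key] = value
```

With this shape, an empty-string default coming from a command table and an
explicit None both produce an empty map instead of an attempt to open a file.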
- create error.py for exception classes to reduce demandloading
- move revlog exceptions to it
- change users to import error and drop revlog import if possible
Changes cvsps.py's cvs log reader to use a one-line lookahead, so
that possibly misleading log messages can be disambiguated. In
particular, I have past committers who used cvs log's 28-character
row of hyphens within commit messages; this throws cvsps off and disrupts
conversion. The only alternative in this case is to edit the cvs
,v file by hand, which violates mercurial's "don't change history"
principle.
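A one-line lookahead of this kind might look like the sketch below. It is not
the code from cvsps.py: the helper names are hypothetical, and treating a row
of hyphens as a boundary only when the next line starts with "revision " is an
assumption about the cvs log layout.

```python
import re

SEPARATOR = "-" * 28
REVISION = re.compile(r"revision ([\d.]+)")


def with_lookahead(lines):
    """Yield (line, next_line) pairs; next_line is None for the last line."""
    it = iter(lines)
    try:
        current = next(it)
    except StopIteration:
        return
    for upcoming in it:
        yield current, upcoming
        current = upcoming
    yield current, None


def split_revisions(lines):
    """Group log lines into per-revision blocks.

    A separator row only starts a new block when the following line looks
    like a revision header; otherwise it is kept as commit message text.
    """
    blocks, current = [], None
    for line, upcoming in with_lookahead(lines):
        is_boundary = (
            line == SEPARATOR
            and upcoming is not None
            and REVISION.match(upcoming) is not None
        )
        if is_boundary:
            current = []
            blocks.append(current)
        elif current is not None:
            current.append(line)
    return blocks
```

A message that contains a row of hyphens immediately followed by a line
starting with "revision " would still be misread, so this narrows the
ambiguity rather than removing it.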
Built on top of previous patches:
- continuation-of parsing
- registered archives retrieval
- use of fully qualified revisions
This allows the converter to scan for more source revisions by
following the tree versions 'leaked' through the continuation-of
information. Coupled with the registered archives retrieval, this
makes it possible to decide whether to follow such a hint or stop
scanning for more revisions.
This also implies some changes in the retrieval of some base-0
revisions when they are a continuation-of other revisions: in that
case a 'replay' will work where a simple 'get' fails because the
dir already exists. I found the code dealing with 'replay' quite
good, as it already has a fallback to 'get' in the error path.
In GNU Arch, continuation-of was often used for:
- tagging revisions
- continuing work on a project in a new archive, because arch
was scaling poorly in revision numbers (cat-logs were slow
to be parsed and scanned through)
- very similar to the previous point, forking one's own branch of
a project.
Parsing this header information will allow the converter to 'follow' new history
because it often hints at older/forked/personal revision trees.
This patch however just implements the parsing of the
continuation-of header. A followup patch will implement the proper
use of this new information.
There is no need to lose information in the conversion process. Losing it could
lead to wrong shamap mappings if different archives used the same 'version'
naming.
GNU Arch used to scale very poorly as revision numbers
increased. This was mostly caused by the huge number of
cat-logs it had to scan/read through to keep track of all
patches that were merged in a given revision.
In order to improve things, cat-log pruning was a common
admin task that would accelerate cat-log parsing at the expense
of making the pruned cat-logs unreachable locally.
However, these missing cat-logs are still available in the archive.
So try to get them from the archive as a fallback solution.
cat-log parsing was very wrong. It assumed the Summary header
was coming last, which is wrong. Plus the code was buggy because
it was concatenating all headers into the summary.
As parsing GNU Arch isn't trivial, and python email code does it
so well... just use that ;-)
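As a rough sketch of that approach, the standard email module can split a
cat-log into headers and body regardless of header order. The sample cat-log
below is made up for illustration, and the helper name parse_catlog is
hypothetical.

```python
import email

# Made-up cat-log for illustration: "Name: value" headers, a blank line,
# then the free-form body. Summary is deliberately not the last header.
SAMPLE_CATLOG = """\
Revision: hello--main--0.1--patch-1
Archive: user@example.com--2005
Creator: A User <user@example.com>
Standard-date: 2005-01-03 11:00:00 GMT
Summary: fix the frobnicator
New-files: a.txt b.txt
    c.txt

Longer description of the change.
"""


def parse_catlog(text):
    """Return (headers, body) for an RFC 822 style cat-log."""
    message = email.message_from_string(text)
    return dict(message.items()), message.get_payload()


headers, body = parse_catlog(SAMPLE_CATLOG)
print(headers["Summary"])
print(body.strip())
```

Because each header is looked up by name, the summary no longer depends on
where it appears, and continuation lines stay attached to their own header.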
Changes docstrings to begin with a lowercase word. Only docstrings
used in help output are changed.
Scripts are not expected to grep the output of 'hg help' so this
change should pose no problem with regard to the compatibility rules.
This change is brain damaged: there is no reason the copyfrom revision of the
project items should have any relevance when deciding the revision parent. It is
meaningful only when fetching files content.
An incorrectly converted graph was spotted in the pyglet svn repository at:
------------------------------------------------------------------------
r274 | r1chardj0n3s | 2006-12-21 02:02:14 +0100 (Jeu, 21 Dec 2006) | 2 lines
Changed paths:
A /branches/richard-glx-version (from /trunk:269)
M /branches/richard-glx-version/pyglet/window/xlib/__init__.py
R /branches/richard-glx-version/tests/test.py (from /trunk/tests/test.py:270)
R /branches/richard-glx-version/tools/info.py (from /trunk/tools/info.py:272)
R /branches/richard-glx-version/website/get_involved.php (from /trunk/website/get_involved.php:273)
Branching to horribly mangle GLX
cvsps version 2.2b1 as found in Fedora 10 outputs the following format:
---------------------
PatchSet 1
Date: 2008/11/26 00:59:46
Author: mk
Branch: HEAD
Tag: (none)
Branches: INITIAL
Log:
Initial revision
Members:
a:INITIAL->1.1
b/c:INITIAL->1.1
---------------------
The parser overwrote the Branch value with noise from the misparsed Branches
value.
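One plausible way this kind of bug arises is a prefix test such as
line.startswith("Branch"), which also matches "Branches: INITIAL". The sketch
below is an illustration under that assumption, not the converter's actual
parser; the function names are hypothetical. It keys each field on the full
header name instead.

```python
def parse_header(line):
    """Split 'Key: value' into (key, value); return None for other lines."""
    key, sep, value = line.partition(":")
    if not sep:
        return None
    return key.strip(), value.strip()


def parse_patchset_headers(lines):
    """Collect header fields, keyed on the complete header name."""
    fields = {}
    for line in lines:
        parsed = parse_header(line)
        if parsed is not None:
            key, value = parsed
            fields[key] = value
    return fields


HEADER_LINES = [
    "PatchSet 1",
    "Date: 2008/11/26 00:59:46",
    "Author: mk",
    "Branch: HEAD",
    "Tag: (none)",
    "Branches: INITIAL",
]

fields = parse_patchset_headers(HEADER_LINES)
print(fields["Branch"], fields["Branches"])
```

Here "Branch" and "Branches" end up as separate keys, so the later line cannot
overwrite the earlier one.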
The former code failed when tracking child directories we assumed were renamed with
their parents but were really created in the tags directory. This happens in
the jQuery repository with /tags/ui/1.5b4/release@5455.
When converting git repos, all stuff happening on branches
seems to be ignored.
This is caused by the fact that a "git clone" of a remote git
repo has all its branches prefixed with "origin/". By
chance, the "origin/master" branch is always linked to a
local "master" branch. So getheads() returns only the
master head, and it ignores all the other heads.
Make sure getheads() returns all heads, forcing remote
branches to be returned by git-rev-parse.
The latest GIT has some changes in the way it is installed. Only the 'git'
executable needs to be in the path. All other commands are treated as
subcommands of 'git'.
Fix identified by frank@kingswood-consulting.co.uk
Changed usage from os.environ["HOME"] to expanduser("~/.cvspass") as this is
the only usage of this construct in mercurial sources.
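For reference, the two lookups differ in how they behave when HOME is not set.
A small sketch follows; the function name cvspass_path is hypothetical.

```python
import os


def cvspass_path():
    """Locate ~/.cvspass without indexing os.environ directly.

    os.environ["HOME"] raises KeyError when HOME is unset, whereas
    os.path.expanduser falls back to other platform-specific lookups.
    """
    return os.path.expanduser("~/.cvspass")


print(cvspass_path())
```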
Subversion allows revisions to be composed of subparts coming from revisions
before the parent or from other parts of the repository. There is no simple
representation for these now, so keep the changes but do not track their origins.
With huge history (like kdelibs), the process termination suddenly consumes a
lot of memory (from 700M to 1.3G+). Since the job is done, clean termination is
not required, just exit.
- omitting convert.svn.branches defaults to "branches"
- convert.svn.branches= disables branches detection
- convert.svn.branches=/ is equivalent to former convert.svn.branches=
Two branches a and b starting at root, with commits interleaved like:
root a0 a1 b0 a2 a3 b1
were converted in the following order:
root a0 b0 a1 b1 a2 a3
Replace depth based toposort with a more classic traversal method.
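The difference can be reproduced on the example above. In this sketch the
graph shape (a0..a3 and b0, b1 as two linear chains starting at root) is
assumed from the description, and the depth-first traversal is one classic
alternative, not necessarily the method the patch uses.

```python
# Child -> parents, listed in original commit order.
PARENTS = {
    "root": [],
    "a0": ["root"],
    "a1": ["a0"],
    "b0": ["root"],
    "a2": ["a1"],
    "a3": ["a2"],
    "b1": ["b0"],
}


def depth_sort(parents):
    """Order nodes by distance from the root (the approach being replaced)."""
    depth = {}

    def node_depth(node):
        if node not in depth:
            depth[node] = 1 + max(
                (node_depth(p) for p in parents[node]), default=-1
            )
        return depth[node]

    # sorted() is stable, so equal depths keep their commit order.
    return sorted(parents, key=node_depth)


def dfs_toposort(parents):
    """Emit every node after all of its parents, using an explicit stack."""
    order, seen = [], set()
    for start in parents:
        stack = [start]
        while stack:
            node = stack[-1]
            pending = [p for p in parents[node] if p not in seen]
            if pending:
                stack.append(pending[0])
            else:
                stack.pop()
                if node not in seen:
                    seen.add(node)
                    order.append(node)
    return order


print(" ".join(depth_sort(PARENTS)))
print(" ".join(dfs_toposort(PARENTS)))
```

Sorting by depth interleaves the two branches, while the traversal only
requires parents to come before children and so can keep the original order.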
The problem is that commit.date was previously used for sorting, but it's a
string like "1 Jan xxx 2007", so it is wrong to use it for sorting.
Another open question is why we are using depth for sorting -- I have no clear
answer -- it seems to be plain wrong.
This patch is just an RFC.
Previously the parent was determined by the last changeset where the branched
file was changed even if the branch is based on an earlier revision.
Fix written by mpm.
This should give the user a better hint of what's going wrong.
Improve some error messages. In particular, mention "CVS checkout" instead
of "CVS repo".
Fixes issue822 and issue826.
Add Mercurial as a source format, clarify that the include directive triggers the exclusion of all files/dirs that are not explicitly included, and use MAPFILE instead of revmapfile in the text, following the short message convention.
Better handles this case:
The output from cvsps -A -u --cvs-direct -q:
---------------------
PatchSet 1
Date: 2008/02/08 20:33:28
Author: fk
Branch: HEAD
Tag: (none)
Log:
initial
Members:
file_one:INITIAL->1.1
---------------------
PatchSet 2
Date: 2008/02/08 20:33:32
Author: fk
Branch: branch_name
Ancestor branch: HEAD
Tag: (none)
Log:
new file on branch
Members:
file_two:1.1->1.1.2.1
- Improve performance by reading 'replay' output instead of
calling 'delta' command after 'replay'. This increases speed
significantly.
- Sometimes the 'replay' command might fail with conflicts (don't
know why); a fresh get from that revision just fixes it. So,
if something fails, get a fresh copy from that revision and
try from there.
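The fallback described in the second point can be sketched as below. The
names ReplayError and update_to are hypothetical, and replay and get are
caller-supplied callables standing in for the corresponding commands.

```python
class ReplayError(Exception):
    """Hypothetical error raised when a 'replay' step fails."""


def update_to(rev, replay, get):
    """Try the incremental path first, then fall back to a fresh copy.

    Both callables take a revision name and return whatever the caller
    needs from the updated tree.
    """
    try:
        return replay(rev)
    except ReplayError:
        # For example, conflicts: start over from a clean tree at this revision.
        return get(rev)
```

Later revisions can then be replayed again on top of the fresh copy.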