Commit Graph

213 Commits

Author SHA1 Message Date
Siddharth Agarwal
d239f557d1 gitrepo.listkeys: use githandler from localrepo
Previously we'd load the git and hg maps twice on separate git handler objects.
This avoids that.

For a repo with over 50,000 commits, this brings a no-op hg pull down from 2.45
seconds to 2.37.
2014-02-19 15:07:19 -08:00
Siddharth Agarwal
772133c48a hgrepo.tags: use githandler property
Currently we call hgrepo.tags() separately for each tag. (This should be fixed
at some point.) This avoids initializing a separate git handler for each tag.

For a repository with over 150 tags, this brings down a no-op hg pull by 0.05
seconds.
2014-02-19 14:16:40 -08:00
Siddharth Agarwal
232c6612ae hgrepo._findtags: use githandler property 2014-02-19 14:15:33 -08:00
Siddharth Agarwal
1c6dc044d5 hgrepo.findoutgoing: use githandler property 2014-02-19 14:14:54 -08:00
Siddharth Agarwal
f87f28dc3a hgrepo.push: use githandler property 2014-02-19 14:14:01 -08:00
Siddharth Agarwal
63f40c5059 hgrepo.pull: use githandler property 2014-02-19 14:12:38 -08:00
Siddharth Agarwal
728f8df8de hgrepo: expose git handler as a property
This and upcoming patches have the goal of initializing a GitHandler just once
for a Mercurial repo.
2014-02-19 14:12:03 -08:00
Siddharth Agarwal
7d37b2a516 git_handler: terminate new commit DAG traversal at known commits
Any commit in _map_git is already known, so there's no point walking further
down the DAG.

For a repo with over 50,000 commits, this brings down a no-op hg pull from 38
seconds to 2.5.
2014-02-18 20:30:27 -08:00
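
The idea in 7d37b2a516 can be illustrated with a minimal sketch. It is not the hg-git implementation: `find_new_commits`, `parents_of` and `known` are hypothetical names, with `known` standing in for `_map_git` and `parents_of` standing in for reading parents out of Git commit objects. It assumes, as the commit message does, that every ancestor of a known commit is also known.

```python
def find_new_commits(heads, parents_of, known):
    """Return commits reachable from heads that are not already known.

    parents_of: dict mapping commit id -> list of parent ids (hypothetical
                stand-in for reading Git commit objects).
    known:      set of commit ids that were already converted (stand-in
                for _map_git).
    """
    new = []
    seen = set()
    todo = list(heads)
    while todo:
        sha = todo.pop()
        if sha in seen or sha in known:
            # A known commit ends this branch of the walk: its ancestors
            # are assumed to be known as well.
            continue
        seen.add(sha)
        new.append(sha)
        todo.extend(parents_of.get(sha, []))
    return new
```

When every head is already known, the walk ends immediately instead of visiting the whole history, which is the no-op pull case the commit message measures. The sketch returns commits in visit order, not in the topological order the real code provides.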
Siddharth Agarwal
6f79df86d2 git_handler: use convert_list to cache git objects
getnewgitcommits() does a weird traversal where a particular commit SHA is
visited as many times as the number of parents it has, effectively doubling
object reads in the standard case with one parent. This patch makes the
convert_list a cache for objects, so that a particular Git object is read just
once.

On a mostly linear repository with over 50,000 commits, this brings a no-op hg
pull down from 70 seconds to 38, which is close to half the time, as expected.
Note that even a no-op hg pull currently does a full DAG traversal -- an
upcoming patch will fix this.
2014-02-18 20:22:13 -08:00
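
A rough sketch of the caching idea in 6f79df86d2, under stated assumptions: `CountingStore` and `load_cached` are hypothetical helpers, and a plain dict plays the role the commit message gives to `convert_list`. The store counts reads so the saving is visible.

```python
class CountingStore:
    """Hypothetical object store that counts how often it is read."""

    def __init__(self, objects):
        self._objects = objects
        self.reads = 0

    def __getitem__(self, sha):
        self.reads += 1
        return self._objects[sha]


def load_cached(store, cache, sha):
    """Read sha from store at most once, keeping the result in cache."""
    if sha not in cache:
        cache[sha] = store[sha]
    return cache[sha]
```

Visiting the same SHA a second time then costs a dictionary lookup and not another object read, which matches the roughly halved time reported for a mostly linear history.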
Siddharth Agarwal
36052aca77 git_handler: note that new commits are returned in topo order
This wasn't obvious to me at first.
2014-02-18 20:13:15 -08:00
Siddharth Agarwal
5e72b26e7b git_handler: fix progress reset call 2014-02-16 01:13:10 -08:00
Siddharth Agarwal
298fec2a4b git_handler: use repo.changelog.node instead of repo.lookup
For a repo with over 50,000 commits, this brings down the computation of
'export' from 1.25 seconds to 0.25 seconds.

To scale this to hundreds of thousands of commits, one solution might be to
maintain the mapping in a DAG data structure mirroring the changelog, over
which findcommonmissing can be used.
2014-02-16 01:11:47 -08:00
Siddharth Agarwal
d7dbce79bd hg2git: call _handle_subrepos when .hgsubstate is removed
Now that _handle_subrepos can handle .hgsubstate being removed, we should use
it for that.

The test changes make sure that the SHAs roundtrip.
2014-02-12 22:55:16 -08:00
Siddharth Agarwal
39d1c15298 hg2git: make _handle_subrepos worked in the removed case
A test for this will be included in an upcoming patch.
2014-02-12 21:19:04 -08:00
Siddharth Agarwal
ca74d6d967 hg2git: add 'new' prefix to _handle_subrepos variables
An upcoming patch will introduce similar variables for self._ctx. This helps
disambiguate.
2014-02-12 20:34:09 -08:00
Siddharth Agarwal
3cadf19b94 hg2git: factor out subrepo parsing into a separate function
This code will be used in multiple contexts in an upcoming patch.
2014-02-12 20:28:28 -08:00
Siddharth Agarwal
44c13be822 hg2git: factor out remove path logic into a separate function
This will be used by _handle_subrepos in an upcoming patch.
2014-02-12 19:50:56 -08:00
Siddharth Agarwal
94957f9a66 git_handler: remove collect_gitlinks now that it is unused 2014-02-15 16:21:49 -08:00
Siddharth Agarwal
8d0c4fe9f2 git_handler: fix hgsubstate generation
Before this patch, in the git to hg conversion, .hgsubstate once created is
never deleted, even if no submodules are any longer present. This is broken
state, as shown by the test for which the SHA changes. Fix that by looking at
the diff instead of just what submodules are present.

Since 'gitlinks' now contains *changed* gitlinks, not *all* gitlinks, it no
longer makes sense to gate gitmodules checks on that.

This patch simply demonstrates that the test was broken; an upcoming patch will
introduce more tests.

Bonus: this also makes the import process faster because we no longer need to
walk the entire tree to collect gitlinks.

This will cause the SHAs of repos that have submodules added and then removed
to change.
2014-02-14 15:44:50 -08:00
Siddharth Agarwal
94f67b719d git_handler: move check for gparents in repo to start of import_git_commit
Also drop Mercurial < 1.5 support.
2014-02-14 16:16:25 -08:00
Siddharth Agarwal
82b4618e1f git_handler: move gparents initialization up to start of import_git_commit
gparents will be used to compute .hgsubstate in an upcoming patch.
2014-02-14 13:15:45 -08:00
Siddharth Agarwal
35411ae613 git_handler: return gitlinks in get_files_changed
Currently, to figure out which gitlinks are in a repository we walk through the
entire tree. This patch lets us use get_files_changed to detect which gitlinks
have changed.
2014-02-14 11:31:54 -08:00
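
Detecting changed gitlinks from a diff, as 35411ae613 describes, might look roughly like the sketch below. It is an illustration only: `changed_gitlinks` and the `path -> (mode, sha)` dictionaries are hypothetical and do not reflect the actual return value of get_files_changed. The one fixed fact it relies on is that Git records a submodule entry with mode 160000 (octal).

```python
GITLINK_MODE = 0o160000  # mode Git uses for submodule (gitlink) entries


def changed_gitlinks(old_tree, new_tree):
    """Return {path: sha or None} for gitlinks that differ between trees.

    old_tree / new_tree: dicts mapping path -> (mode, sha); a hypothetical
    flattened view of two Git trees. A value of None means the gitlink was
    removed or replaced by a non-gitlink entry.
    """
    changed = {}
    for path in sorted(set(old_tree) | set(new_tree)):
        old = old_tree.get(path)
        new = new_tree.get(path)
        if old == new:
            continue  # untouched entry, nothing to report
        if new is not None and new[0] == GITLINK_MODE:
            changed[path] = new[1]
        elif old is not None and old[0] == GITLINK_MODE:
            changed[path] = None
    return changed
```

Only entries that differ are examined for their mode, so unchanged submodules and ordinary files never show up in the result.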
Siddharth Agarwal
873a402c5e hg2git: call status on newctx, not newctx.rev()
There's no benefit to calling rev().
2014-02-12 18:05:12 -08:00
Siddharth Agarwal
17657a025c hg2git: store ctx instead of rev
Storing a ctx enables values like manifests to be cached on the context.
2014-02-12 17:49:14 -08:00
Siddharth Agarwal
b470bfcf51 hg2git: rename ctx to newctx in update_changeset
An upcoming patch will introduce a new field called _ctx. This helps prevent
confusion.
2014-02-12 17:47:38 -08:00
Dov Feldstern
97838c2daf fall back to unauthenticated http(s) access when using older dulwich 2014-02-13 02:00:18 +02:00
Dov Feldstern
7e0c3b141b support for http(s) basic authentication
This is an adaptation of the original patch submitted in [1], without the
monkey-patching: a patch has been committed in dulwich [2] which allows clients
to supply a custom urllib2 "opener" for opening the url; here, we provide such
an opener, which provides authentication information obtained from the hg
config.

[1] https://groups.google.com/forum/#!topic/hg-git/9clPr1wdtiw
[2] https://bugs.launchpad.net/dulwich/+bug/909037
2014-02-13 01:37:22 +02:00
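
For illustration, an opener that carries basic-authentication credentials can be built as in the sketch below. It uses Python 3's `urllib.request`, the successor to the `urllib2` module the commit refers to. The function name `make_auth_opener` and its parameters are hypothetical; how hg-git reads the credentials from the hg config and hands the opener to dulwich is not shown and is not implied here.

```python
import urllib.request


def make_auth_opener(url, username, password):
    """Build an opener that answers HTTP basic-auth challenges for url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # Realm None means "use these credentials for any realm at this URL".
    password_mgr.add_password(None, url, username, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)
```

Building the opener performs no network access; credentials are only sent when a request made through it receives an authentication challenge.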
Siddharth Agarwal
a06bbdeeac git_handler: don't bail on multiple octopus merges in succession
Consider two octopus merges, one of which is a child of the other. Without this
patch, get_git_parents() called on the second octopus merge checks that each p1
is neither in the middle of an octopus merge nor the end of it. Since the end
of the first octopus merge is a p1 of the second one, this asserts.

Change the sanity check to only make sure that p1 is not in the middle of an
octopus merge.
2014-02-11 22:13:34 -08:00
Jordi Gutiérrez Hermoso
f4e623cf8f gitdirstate: import errno for handling OSError
When handling OSError while visiting subdirectories, we're checking
errno, but we never imported this module. This small patch fixes this.
2014-02-07 10:43:49 -05:00
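
The pattern this fix restores, checking `errno` on a caught OSError, looks roughly like the sketch below. `list_dir_or_empty` is a hypothetical helper, and the specific error codes tested are an assumption; the commit message does not say which codes gitdirstate checks.

```python
import errno
import os


def list_dir_or_empty(path):
    """List a directory, treating a missing directory as empty."""
    try:
        return sorted(os.listdir(path))
    except OSError as err:
        # Without "import errno" these names would raise NameError here,
        # hiding the original OSError.
        if err.errno in (errno.ENOENT, errno.ENOTDIR):
            return []
        raise
```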
anatoly techtonik
a09eed97b0 git_handler.py: less cryptic error message when push fails 2013-12-15 15:19:22 -05:00
Augie Fackler
1dfa836781 testedwith: drop 2.3.1, which has at least one test failure 2013-12-14 11:59:39 -05:00
Augie Fackler
e94e063d88 testedwith: add 2.8.1 2013-12-14 11:19:39 -05:00
Augie Fackler
7dc6835322 gitignore: gate feature on dirstate having rootcache and ignore having readpats 2013-12-14 11:19:25 -05:00
Augie Fackler
6d7d1ac665 git_handler: iterate over new refs in sorted order to stabilize test output
An earlier patch already fixes the test expectations (oops), so this
just makes sure the tests always pass.
2013-12-13 13:02:08 -05:00
Augie Fackler
58fd252e1f overlay: add kludge to make sure we only ever give hexshas to dulwich 2013-12-13 12:54:39 -05:00
Jordi Gutiérrez Hermoso
adf3575aa8 git-handler: turn refs from None to {} so that empty git repos can convert 2013-12-03 16:55:17 -05:00
Ben Kehoe
6f094a5bfe Fix for #68 | Use .gitignore files (with proper semantics) 2013-11-27 09:27:59 -05:00
Augie Fackler
29a7a3aee8 overlays: fix incoming support for hg 2.8
This was crafted mostly via a bunch of aimless flailing in the
code. I'm pretty well convinced at this point that the incoming
support needs to be rewritten slightly to behave properly in the new
world order (specifically, the overlayrepo class probably should be
subclassing localrepo, or else more directly reimplementing things
instead of trying to forward methods.)
2013-10-05 17:40:50 -04:00
Augie Fackler
6f0e54cb16 Merge a work-around for a bug in dulwich.
I've been waiting for dulwich upstream to fix this *and* for a test
from domruf that's acceptable. Having gotten neither over a period of
/months/, and having hit the bug myself, I'm moving on and accepting a
patch without tests. This will likely break again, but hopefully
before we'd break it dulwich will be fixed.
2013-09-17 09:59:36 -04:00
Augie Fackler
df2fed070b git_handler: fix bugs introduced by 93aaff49e601 which could never have passed tests 2013-09-17 09:58:36 -04:00
Augie Fackler
3460706765 git_handler: clean up trailing whitespace 2013-09-17 09:58:12 -04:00
Alex Regueiro
a5c98b616b Upgraded to use latest version of dulwich (0.9.1). 2013-09-13 01:42:27 +01:00
Augie Fackler
ff1e9014cf merge 2013-08-28 13:49:01 -04:00
nsuke
291493c743 git_handler: skip exporting hg tags whose names are not valid git tag names 2013-08-12 23:20:41 +09:00
Risto Kankkunen
e3f5583d09 Make the path part of URL contain a leading slash only if it's not followed by tilde. (issue #71) 2013-07-11 00:22:07 +03:00
André Felipe Dias
071243136a Fixes #54 | option branch_bookmark_suffix doesn't move bookmarks along
Test case based on the one proposed by David Carr at
https://bitbucket.org/durin42/hg-git/issue/54/with-option-branch_bookmark_suffix-set
2013-07-01 16:04:53 -03:00
Gregory Szorc
10dcc5b5c0 Only export modified Git trees
Previously, we emitted every Git tree when updating between Mercurial
changesets. With this patch, we now only emit Git trees that changed. A
side-effect of the implementation is that we now only update in-memory
Git tree objects that changed. Before, we always touched Git trees,
invalidating them in the process and causing Dulwich to recalculate
their SHA-1. Profiling revealed this to be expensive and removing the
extra calculation shows a nice performance win.

Another optimization is to not sort the order that changed paths are
processed in. Previously, we sorted by length, longest to shortest.
Profiling revealed that the sorts took a non-trivial amount of time.
While sorted execution resulted in likely idempotent behavior, it
shouldn't be strictly required.

On the author's machine, conversion of the Mercurial repository itself
decreased from ~493s to ~333s. Even more impressive is conversion of
Firefox's main repository (which is considerably larger). Converting the
first 200 revisions of that repository decreased from ~152s to ~42s.
2013-04-14 11:11:41 -07:00
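
One way to picture "only emit Git trees that changed" is to compute which directories contain a changed path, since only those trees (and their ancestors up to the root) can get a new SHA-1. The sketch below is an assumption-laden illustration, not the code from 10dcc5b5c0; `dirty_trees` is a hypothetical name and paths are assumed to be slash-separated and repository-relative.

```python
import posixpath


def dirty_trees(changed_paths):
    """Return the set of directories whose Git tree must be re-emitted.

    The root tree is represented by the empty string.
    """
    dirty = set()
    for path in changed_paths:
        directory = posixpath.dirname(path)
        while True:
            dirty.add(directory)
            if not directory:
                break  # reached the root tree
            directory = posixpath.dirname(directory)
    return dirty
```

Trees outside this set keep their previous SHA-1 and need neither rehashing nor re-export, which is where the reported saving would come from.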
Augie Fackler
66478492e0 overlaymanifest: add iteritems(), used by recent hg versions 2013-04-03 14:37:13 -05:00
Gregory Szorc
baa19027ef Export Git objects from incremental Mercurial changes
This replaces the brute force Mercurial to Git export with one that is
incremental. It results in a decent performance win and paves the road
for parallel export via using multiple incremental exporters.
2013-03-19 22:44:01 -07:00
Hal Wine
e83de5b7dc scrub bad timezone values before dulwich sees them
If dulwich is presented with a "sub minute" timezone offset, it throws
an exception (see tests/test-timezone.t). This patch rounds the timezone
down to the next minute before passing the value to dulwich.
2013-02-05 08:25:37 -08:00
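
A minimal sketch of scrubbing a sub-minute offset, assuming the offset is expressed in seconds. `scrub_timezone` is a hypothetical name, and rounding toward negative infinity is an assumption; the commit message says only that the value is rounded down to a whole minute, so the actual patch may treat negative offsets differently.

```python
def scrub_timezone(offset_seconds):
    """Drop any sub-minute part of a timezone offset given in seconds."""
    # Python's % always returns a non-negative remainder for a positive
    # divisor, so this rounds toward negative infinity.
    return offset_seconds - offset_seconds % 60
```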