For a repo with over 50,000 commits, this brings down the computation of
'export' from 1.25 seconds to 0.25 seconds.
To scale this to hundreds of thousands of commits, one solution might be to
maintain the mapping in a DAG data structure mirroring the changelog, over
which findcommonmissing can be used.
Before this patch, in the git to hg conversion, .hgsubstate once created is
never deleted, even if no submodules are any longer present. This is broken
state, as shown by the test for which the SHA changes. Fix that by looking at
the diff instead of just what submodules are present.
Since 'gitlinks' now contains *changed* gitlinks, not *all* gitlinks, it no
longer makes sense to gate gitmodules checks on that.
This patch simply demonstrates that the test was broken; an upcoming patch will
introduce more tests.
Bonus: this also makes the import process faster because we no longer need to
walk the entire tree to collect gitlinks.
This will cause the SHAs of repos that have submodules added and then removed
to change.
Currently, to figure out which gitlinks are in a repository we walk through the
entire tree. This patch lets us use get_files_changed to detect which gitlinks
have changed.
This is an adaptation of the original patch submitted in [1], without the
monkey-patching: a patch has been committed in dulwich [2] which allows clients
to supply a custom urllib2 "opener" for opening the url; here, we provide such
an opener, which provides authentication information obtained from the hg
config.
[1] https://groups.google.com/forum/#!topic/hg-git/9clPr1wdtiw
[2] https://bugs.launchpad.net/dulwich/+bug/909037
Consider two octopus merges, one of which is a child of the other. Without this
patch, get_git_parents() called on the second octopus merge checks that each p1
is neither in the middle of an octopus merge nor the end of it. Since the end
of the first octopus merge is a p1 of the second one, this asserts.
Change the sanity check to only make sure that p1 is not in the middle of an
octopus merge.
I've been waiting for dulwich upstream to fix this *and* for a test
from domruf that's acceptable. Having gotten neither over a period of
/months/, and having hit the bug myself, I'm moving on and accepting a
patch without tests. This will likely break again, but hopefully
before we'd break it dulwich will be fixed.
This replaces the brute force Mercurial to Git export with one that is
incremental. It results in a decent performance win and paves the road
for parallel export via using multiple incremental exporters.
If dulwich is presented with a "sub minute" timezone offset, it throws
an exception (see tests/test-timezone.t). This patch rounds the timezone
down to the next minute before passing the value to dulwich.
As pointed out by l33t, Hg-Git's output for push doesn't currently do a very
good job of telling the user what happened. My previous changes in this area
had moved some of the output from status to note, making it only show if
--verbose was specified. However, I hadn't realized at the time that the
reference information (though overly verbose) was providing a valueable purpose
that otherwise wasn't met; telling the user that a remote reference had changed.
This changeset makes it so that:
* default output will include simple messages like "adding reference
refs/heads/feature" and "updating reference refs/heads/master" (omitting any
mention of unchanged references)
* verbose output will include more detailed messages like "adding reference
default::refs/heads/feature => GIT:aba43c" and "updating reference
default::refs/heads/master => GIT:aba43c" (omitting any mention of unchanged
references)
* debug output will include the detailed output like in verbose, but
addtionally will include messages like "unchanged reference
default::refs/heads/other => GIT:aba43c"
https://bitbucket.org/durin42/hg-git/issue/64/push-confirmation
l33t pointed out that currently, Hg-Git doesn't provide any confirmation that a
push was successful other than the exit code. Normal Mercurial provides a
couple other messages followed by "added X changesets with Y changes to
Z files". After this change, Hg-Git will provide much more similar output.
It's not identical, as the underlying model is substantially different, but the
concept is the same. The main message is "added X commits with Y trees and
Z blobs".
This change doesn't affect the output of what references/branches were touched.
That will be addressed in a subsequent commit.
Dulwich doesn't provide an easy hook to get the information needed for this
output. Instead of passing generate_pack_contents as the pack generator
function to send_pack, I pass a custom function that determines the "missing"
objects, stores the counts, and then calls generate_pack_contents (which then
will determine the "missing" objects again.
The new expected output:
searching for changes # unless quiet true
<N> commits found # if verbose true
list of commits: # if debugflag true and at least one commit found
<each hash> # if debugflag true and at least one commit found
adding objects # if at least one commit found unless quiet true
added <N> commits with <N> trees and <N> blobs # if at least one object unless
# quiet true
https://bitbucket.org/durin42/hg-git/issue/64/push-confirmation
When communicating with the user on push/outgoing, Mercurial doesn't show a
"exporting hg objects to git" message, so we shouldn't. The message has been
changed to be shown if --verbose is specified.
When communicating with the user on push, Mercurial doesn't show much on
success. Currently, Hg-Git shows every changed ref. After this change,
the default output will more closely match Mercurial's regular behavior (no
per-ref output), while changed refs will be shown if --verbose is specified,
and all refs will be shown if --debug is specified.
This changeset adds test coverage for comparing "hg outgoing -B" in normal
Mercurial usage with Hg-Git usage. This didn't match, since previously, gitrepo
didn't provide a meaningful listkeys implementation. Now, it does.
gitrepo now has access to a GitHandler when a localrepo is available. This
handler is used to access the information needed to implement listkeys for
namespaces (currently, only bookmarks) and bookmarks.
A couple of other tests were testing "divergent bookmark" scenarios. These
tests have been updated to filter out the divergent bookmark output, as it isn't
consistent across the supported Mercurial versions.
In the logic that was attempting to handle the case where the local repo doesn't
have any bookmarks, the assumption was being made that tip resolved to a
non-null revision. In the case of a totally empty local repo, however, that
isn't a valid assumption, and resulted in attempting to set the master ref
to None, which broke dulwich.
The "fix", which avoids the traceback and allows the push to complete (though
still do nothing, since in this case there aren't any changes to push), is to
not tweak the refs at all if tip is nullid. Leaving the special capabilities
ref and not adding a master ref appears to be fine in this case.
The output for "hg push" when there were no changes didn't quite match between
Mercurial with and without Hg-Git, so I changed the behavior to bring it into
synch. The existing "creating and sending data" message was changed to be
included if --verbose is specified.
There was a bug introduced in fa5f235be2cd such that calling hg outgoing on
a Git repository would result in all refs being deleted from the remote
repository (with the possible exception of the currently checked out branch).
It wasn't noticed before because the existing test for outgoing didn't actually
verify the refs on the remote. This changeset fixes the bug, as well as adding
test coverage to allow verifying that the fix works.
When exporting Git commits, verify that the tree and parents objects
exist in the repository before allowing the commit to be exported. If a
tree or parent commit is missing, then the repository is not valid and
the export should not be allowed.