This was crafted mostly via a bunch of aimless flailing in the
code. I'm pretty well convinced at this point that the incoming
support needs to be rewritten slightly to behave properly in the new
world order (specifically, the overlayrepo class probably should be
subclassing localrepo, or else more directly reimplementing things
instead of trying to forward methods.)
I've been waiting for dulwich upstream to fix this *and* for a test
from domruf that's acceptable. Having gotten neither over a period of
/months/, and having hit the bug myself, I'm moving on and accepting a
patch without tests. This will likely break again, but hopefully
dulwich will be fixed before we break it.
Previously, we emitted every Git tree when updating between Mercurial
changesets. With this patch, we now only emit Git trees that changed. A
side-effect of the implementation is that we now only update in-memory
Git tree objects that changed. Before, we always touched Git trees,
invalidating them in the process and causing Dulwich to recalculate
their SHA-1. Profiling revealed this to be expensive and removing the
extra calculation shows a nice performance win.
Another optimization is to stop sorting the changed paths before
processing them. Previously, we sorted by length, longest to shortest.
Profiling revealed that the sorts took a non-trivial amount of time.
While sorted execution resulted in likely idempotent behavior, it
shouldn't be strictly required.
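
The idea of touching only the trees that changed can be sketched as
follows. This is an illustration under stated assumptions, not the
exporter's actual code: the function name dirty_trees is hypothetical,
paths are assumed to be "/"-separated, and the root tree is represented
by the empty string.

```python
import posixpath

def dirty_trees(changed_paths):
    """Return the set of directories whose Git tree must be rewritten.

    Only the ancestors of changed files are marked; every other tree is
    left untouched, so its SHA-1 never needs to be recalculated.
    """
    dirty = set()
    for path in changed_paths:  # processed in arrival order, no sorting
        parent = posixpath.dirname(path)
        while parent not in dirty:
            dirty.add(parent)
            if parent == "":
                break  # reached the root tree
            parent = posixpath.dirname(parent)
    return dirty
```

Because the result is a set, the order in which changed paths arrive
does not affect which trees are marked.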
On the author's machine, conversion of the Mercurial repository itself
decreased from ~493s to ~333s. Even more impressive is conversion of
Firefox's main repository (which is considerably larger). Converting the
first 200 revisions of that repository decreased from ~152s to ~42s.
This replaces the brute force Mercurial to Git export with one that is
incremental. It results in a decent performance win and paves the road
for parallel export using multiple incremental exporters.
A recent real-world occurrence: a user hand-edited the timezone field in
an hg export to provide a unique value (from a prior export). Hg imported
the export okay, but dulwich threw an exception.
This test shows the fault.
If dulwich is presented with a "sub minute" timezone offset, it throws
an exception (see tests/test-timezone.t). This patch rounds the timezone
offset down to a whole minute before passing the value to dulwich.
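
A minimal sketch of the rounding, assuming the offset is an integer
number of seconds (as Mercurial stores it) and that "down" means
flooring. The function name is hypothetical and this is not necessarily
the exact expression used in the patch.

```python
def round_down_to_minute(offset_seconds):
    """Drop the sub-minute part of a timezone offset given in seconds.

    Python's % with a positive divisor never returns a negative value,
    so this floors negative offsets as well as positive ones.
    """
    return offset_seconds - (offset_seconds % 60)
```

Git writes timezones as +HHMM or -HHMM, so an offset such as 3630
seconds cannot be represented until it has been reduced to 3600.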
As pointed out by l33t, Hg-Git's output for push doesn't currently do a very
good job of telling the user what happened. My previous changes in this area
had moved some of the output from status to note, making it only show if
--verbose was specified. However, I hadn't realized at the time that the
reference information (though overly verbose) was serving a valuable purpose
that otherwise wasn't met: telling the user that a remote reference had changed.
This changeset makes it so that:
* default output will include simple messages like "adding reference
refs/heads/feature" and "updating reference refs/heads/master" (omitting any
mention of unchanged references)
* verbose output will include more detailed messages like "adding reference
default::refs/heads/feature => GIT:aba43c" and "updating reference
default::refs/heads/master => GIT:aba43c" (omitting any mention of unchanged
references)
* debug output will include the detailed output like in verbose, but
additionally will include messages like "unchanged reference
default::refs/heads/other => GIT:aba43c"
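
The three output levels listed above can be sketched like this. SketchUI
is a hypothetical stand-in for Mercurial's ui object (it only mimics
status, note and debug), and report_reference is an illustrative name,
not the function used in Hg-Git.

```python
class SketchUI(object):
    """Stand-in for Mercurial's ui; collects messages instead of printing."""

    def __init__(self, verbose=False, debugflag=False):
        self.verbose = verbose or debugflag  # debug implies verbose
        self.debugflag = debugflag
        self.lines = []

    def status(self, msg):
        self.lines.append(msg)

    def note(self, msg):
        if self.verbose:
            self.lines.append(msg)

    def debug(self, msg):
        if self.debugflag:
            self.lines.append(msg)


def report_reference(ui, remote, ref, short_sha, change):
    """Report one reference; change is 'adding', 'updating' or 'unchanged'."""
    detailed = "%s::%s => GIT:%s" % (remote, ref, short_sha)
    if change == "unchanged":
        ui.debug("unchanged reference %s\n" % detailed)
    elif ui.verbose:
        ui.note("%s reference %s\n" % (change, detailed))
    else:
        ui.status("%s reference %s\n" % (change, ref))
```

Unchanged references go through debug only, so they stay out of both
default and verbose output.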
https://bitbucket.org/durin42/hg-git/issue/64/push-confirmation
l33t pointed out that currently, Hg-Git doesn't provide any confirmation that a
push was successful other than the exit code. Normal Mercurial provides a
couple other messages followed by "added X changesets with Y changes to
Z files". After this change, Hg-Git will provide much more similar output.
It's not identical, as the underlying model is substantially different, but the
concept is the same. The main message is "added X commits with Y trees and
Z blobs".
This change doesn't affect the output of what references/branches were touched.
That will be addressed in a subsequent commit.
Dulwich doesn't provide an easy hook to get the information needed for this
output. Instead of passing generate_pack_contents as the pack generator
function to send_pack, I pass a custom function that determines the "missing"
objects, stores the counts, and then calls generate_pack_contents (which then
will determine the "missing" objects again).
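
The wrapping approach can be sketched as below. The names
counting_pack_generator, find_missing and summary are hypothetical;
find_missing is a stand-in that is assumed to yield one type name
("commit", "tree" or "blob") per missing object, which is not Dulwich's
real interface.

```python
def counting_pack_generator(find_missing, generate_pack_contents, counts):
    """Wrap generate_pack_contents so object counts are recorded first."""
    def wrapper(have, want):
        # First pass: count what the remote is missing, by object type.
        for type_name in find_missing(have, want):
            counts[type_name] = counts.get(type_name, 0) + 1
        # Second pass: the real generator works out the same set again.
        return generate_pack_contents(have, want)
    return wrapper


def summary(counts):
    """Format the confirmation message from the recorded counts."""
    return "added %d commits with %d trees and %d blobs" % (
        counts.get("commit", 0), counts.get("tree", 0), counts.get("blob", 0))
```

The wrapper is what gets handed to send_pack in place of
generate_pack_contents; the cost is that the missing objects are
determined twice.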
The new expected output:
searching for changes                          # unless quiet true
<N> commits found                              # if verbose true
list of commits:                               # if debugflag true and at least one commit found
<each hash>                                    # if debugflag true and at least one commit found
adding objects                                 # if at least one commit found unless quiet true
added <N> commits with <N> trees and <N> blobs # if at least one object unless quiet true
https://bitbucket.org/durin42/hg-git/issue/64/push-confirmation
In f32e473ff520, the "commit" function was extracted into a testutil for re-use.
However, test-encoding.t was skipped over in that changeset, as I was seeing
unexplained test failures. Since those test failures have now been explained
(and fixed), this changeset performs the same extraction on test-encoding.t as
was done on all the other tests.
The version of fn_git_commit that was used in testutil redirected all output
(including errors) to /dev/null, which didn't match the expectations of this
test. The test utility functions for commit/tag now no longer throw away error
output, instead leaving it to individual tests to decide if error output should
be ignored.
It looks like Git 1.8.0 started silently converting latin1 commit messages to
utf-8. That changed the result of this test. This changeset alters the test
to make it accept both the pre-1.8.0 and post-1.8.0 behaviors.
https://raw.github.com/git/git/master/Documentation/RelNotes/1.8.0.txt
This test had some form of legacy hash filtering, marked with a TODO to remove
it when we're only supporting Mercurial 1.5 or later. Well, that time has
come, so I removed it.
This test was only running on Mercurial 1.7 or later. Since now we only
support versions that are 1.7 or later, there isn't a need to perform this
check any more.
Now that we're in the unified test format, there isn't a need to use echo
to provide context to command output. This technique actually ends up resulting
in redundant output. To preserve the original context, but eliminate the
redundancy, such echo statements have been converted into comment lines.
Mercurial allows specifying which repository to use via the -R/--repository
option. Git allows a similar function using the --git-dir option. By using
these options, in many cases we can avoid changing the current directory.
This makes tests easier to understand, as you don't need to remember which
directory you're in to understand what's going on. It also makes tests easier
to write, as you don't need to remember to cd out of a directory when you're
done doing things there.
Thanks to Felipe Contreras for the patch which this was based on.
One or both of these requirements were in almost every test in exactly the same
way. Now, these checks are performed in every test that uses the testutil.
This makes it easier for test authors to add these checks into new tests (just
add a reference to the testutil, which you'd probably want anyway).
We considered having each test declare its requirements (currently, either
"git" or "dulwich"), but in this case, preferred the simplicity of having the
check always performed (even if a particular test doesn't need one or the
other). You can't perform any meaningful testing of Hg-Git without both of
these dependencies properly configured. The main value to checking for them
in the tests (rather than just letting the tests fail) is that it gives a
meaningful error message to help people figure out how to fix their environment.
In the case that either git or dulwich is missing, the information will be
just as clearly conveyed regardless of whether it's all the tests that are
skipped, or just most of them.
I didn't add dulwich to hghave (even though this is clearly the sort of thing
that hghave is intended for) because hghave is currently pulled from Mercurial
completely unchanged, and it's probably best to keep it that way.
Tested by running the tests in three configurations:
* No dulwich installed (ran 0, skipped 28, failed 0, output:
Skipped *: missing feature: dulwich)
* Bad git on path (ran 1, skipped 27, failed 0, output:
Skipped *: missing feature: git command line client)
* Working git and correct version of dulwich installed
(ran 28, skipped 0, failed 0)
Thanks to Felipe Contreras for the idea to extract this logic into a library.
Creating a directory, cd-ing into it, running git init, and cd-ing out of the
directory is functionally equivalent to running git init with the directory specified.
In several cases, we were doing the former without performing any other
operations in the git repo, which just made the test unnecessarily complex.
Even in the case where we still want to cd into the directory, calling git
init with the directory name eliminates the need for a separate mkdir command.
This changeset converts the former approach to the latter with the goal of
increasing the readability of the tests.
Thanks to Felipe Contreras for the patch which this was based on.
Previously, if dulwich wasn't available, this test would fail with a traceback
(example included below). This changeset makes it so that the test will be
skipped with an informative message if dulwich isn't available.
Traceback (most recent call last):
  File "/Users/carrd/hg-repos/hg-git-queue/tests/test-url-parsing.py", line 6, in <module>
    from hggit.git_handler import GitHandler
  File "/Users/carrd/hg-repos/hg-git-queue/tests/../hggit/__init__.py", line 42, in <module>
    import gitrepo, hgrepo
  File "/Users/carrd/hg-repos/hg-git-queue/tests/../hggit/gitrepo.py", line 13, in <module>
    from git_handler import GitHandler
  File "/Users/carrd/hg-repos/hg-git-queue/tests/../hggit/git_handler.py", line 4, in <module>
    from dulwich.errors import HangupException, GitProtocolError, UpdateRefsError
ImportError: No module named dulwich.errors
Thanks to Felipe Contreras for the patch which this was based on.
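
The skip-when-missing behaviour for a Python test can be sketched as
follows. This is an illustration rather than the actual test code: the
helper names are hypothetical, and exit status 80 is assumed to be the
value that Mercurial's run-tests.py treats as "skipped".

```python
import importlib
import sys

SKIPPED_STATUS = 80  # assumption: status that run-tests.py reports as skipped


def missing_feature(module_name, feature):
    """Return a skip message if module_name cannot be imported, else None."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return "skipped: missing feature: %s" % feature
    return None


def skip_unless_importable(module_name, feature):
    """Exit with the skip status instead of failing with a traceback."""
    message = missing_feature(module_name, feature)
    if message is not None:
        print(message)
        sys.exit(SKIPPED_STATUS)
```

A test would call skip_unless_importable("dulwich", "dulwich") before
importing anything from hggit, so the ImportError never escapes.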
The functions were renamed to make it clearer that these are shell functions
rather than normal git/hg commands, and to make it clearer which tool is being
invoked.
Old name | New name
------------------------
commit   | fn_git_commit
tag      | fn_git_tag
hgcommit | fn_hg_commit
hgtag    | fn_hg_tag
Extraction from test-encoding.t was left for a subsequent patch, as I was seeing
unexpected output changes when I attempted the extraction.
The gitcommit and hgcommit functions in test-bookmark-workflow.t were left
as-is for now, as they have a different behavior than the standard version
(separate counters for each).
Thanks to Felipe Contreras for the patch which this was based on.
Even though the MQ extension was only used in a single test
(test-pull-after-strip.t), I included it in the testutil. It shouldn't hurt
anything to have it enabled and not used, and saves us from having to deal
with enabling extensions in individual tests at all.
Similarly, this changeset results in the graphlog extension being enabled
for all tests, even though there were some that didn't use it before. This is
even less significant in Mercurial 2.3+, since in those versions, graphlog is
part of core, and is available even when the extension is disabled.