In large repos, the existing manifest fastdelta computation (which performs a
bisect on the raw manifest for every file that is changing) is excessively
slow. This patch makes fastdelta fall back to the normal string delta
algorithm if the number of changes is large.
On a large repo with a commit of 8000 files, this reduces the commit time by 7
seconds (fastdelta goes from 8 seconds to 1).
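As a rough sketch of the heuristic (the helper names and the cutoff value
are illustrative assumptions, not the actual manifest code):

  FASTDELTA_CUTOFF = 1000  # assumed threshold; the real cutoff may differ

  def choosedelta(oldtext, newtext, changedfiles, textdiff, fastdelta):
      # textdiff and fastdelta stand in for the real delta routines
      if len(changedfiles) > FASTDELTA_CUTOFF:
          # many changes: one pass of the generic string delta, O(manifest)
          return textdiff(oldtext, newtext)
      # few changes: one bisect per changed file, O(changes * log(manifest))
      return fastdelta(oldtext, changedfiles)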
I tested this change by modifying the function to compare the old and the new
values and running the test suite. The only difference is that the pure
text-diff algorithm sometimes produces smaller (but functionally identical)
deltatexts than the bisect algorithm.
Looks like the function was never used since its introduction in
083571f47ff6: setColor() below has always used the "rgb(255, 255, 255)"
notation, which doesn't need hex digits.
In addition to taking a step towards getting an unreasonably large function
factored into smaller, more manageable functions, this will allow extensions
such as remotenames to have more control over which pushes are allowed.
We need to call delayupdate again after writing to the changelog.
Otherwise the prechangegroup hook consumes the delayupdate subscription and
future hooks don't see the pending changes (see issue 4934 for more details).
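Roughly, the fixed ordering looks like this (the helper names are
assumptions sketching the changegroup-application flow, not the exact code):

  cl = repo.changelog
  cl.delayupdate(tr)           # register the pending-write callback
  repo.hook('prechangegroup')  # its writepending() consumes the callback
  addrevisions(cl, tr)         # write the incoming changesets
  cl.delayupdate(tr)           # the fix: re-register after writing, so
                               # pretxnchangegroup still sees pending data
  repo.hook('pretxnchangegroup')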
Adds a test that triggers the prechangegroup hook before the pretxnchangegroup
hook and verifies that the output of pretxnchangegroup doesn't change.
Previously we would only include HG_PENDING in the hook args if the
transaction's writepending() actually wrote something. This is a bad
criterion, since it's possible that a previous call to writepending() wrote
something and the hooks still need to see it.
The solution is to always have hooks execute within the scope of the pending
changes by always putting HG_PENDING in the environment.
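A minimal sketch of the new rule (the helper name is an assumption; the
real change is in the hook-environment setup):

  def pendingenv(repo, tr):
      env = {}
      if tr is not None:
          tr.writepending()              # flush anything still buffered
          env['HG_PENDING'] = repo.root  # set unconditionally, not only
                                         # when this call wrote something
      return env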
The SSH peer class accesses wireproto.commands[cmd] as part of encoding
command arguments. Previously, the wire protocol command was defined in
the clonebundles extension. If the client didn't have this extension
enabled (which it likely didn't, since clonebundles is meant as a
server-side extension), attempting to clone via ssh:// from a server
advertising clone bundles crashed with a KeyError when accessing
wireproto.commands['clonebundles'].
Moving the definition of the wire protocol command to wireproto.py makes
this problem go away.
A side effect of this code move is servers will always respond to
"clonebundles" wire protocol command requests. This should be fine: the
server will return an empty response unless a clone bundles manifest
file is present and clients shouldn't call the command unless the server
is advertising the capability, which only happens if the clonebundles
extension is enabled and the manifest file exists.
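The moved command is small; approximately (treat the details, such as the
manifest filename, as a sketch of the in-tree definition):

  from mercurial.wireproto import wireprotocommand

  @wireprotocommand('clonebundles', '')
  def clonebundles(repo, proto):
      # tryread() returns an empty string when no manifest file exists,
      # so answering the command unconditionally is harmless
      return repo.opener.tryread('clone-bundles.manifest')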
Since a JSON string is known to be Unicode, we should try a round-trip
conversion for the localstr type. This patch tests for the localstr type
explicitly because encoding.fromlocal() may raise Abort for an undecodable
str, which is probably not what we want. Maybe we can refactor the json
filter to use the encoding module more later.
Still "{desc|json}" can't round-trip because showdescription() modifies a
localstr object.
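A sketch of the branching described above (simplified; 'jsonescape' stands
in for the real escaping helper):

  from mercurial import encoding

  def tojsonstring(obj, jsonescape):
      if isinstance(obj, encoding.localstr):
          # localstr keeps the original utf-8 bytes, so this round-trips
          u = encoding.fromlocal(obj).decode('utf-8')
      else:
          # a plain str may be undecodable; replace instead of aborting
          u = obj.decode(encoding.encoding, 'replace')
      return '"%s"' % jsonescape(u)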
When using 'extdiff --patch' to check the changes in a rebase, 'precursors(x)'
evaluated to an empty set because I forgot the --hidden flag, so the other
revision was used as the replacement for the empty set. The result was that
the patch for the other revision was diffed against itself, and the tool
reported no differences. That's misleading, since the expected diff
arguments were silently changed, so it's better to bail out.
The other uses of scmutil.revpair() are commands.diff and commands.status, and
it doesn't make sense to allow an empty revision there either. The code here
was suggested by Yuya Nishihara.
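The shape of the check, as a sketch (the real change lives in
scmutil.revpair(), after the revset is computed):

  from mercurial import error
  from mercurial.i18n import _

  def checkrevs(revs):
      if not revs:
          # bail out instead of silently substituting another revision
          raise error.Abort(_('empty revision range'))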
Since Python 2.5, str has the new methods partition and rpartition. They
are more specialized than the usual split and rsplit, sometimes convey the
intent of the code better, and are also a bit faster (faster than
split/rsplit with maxsplit specified). Let's use them in appropriate places
for a small speedup.
Example performance (partition):
$ python -m timeit 'assert "apple|orange|banana".split("|")[0] == "apple"'
1000000 loops, best of 3: 0.376 usec per loop
$ python -m timeit 'assert "apple|orange|banana".split("|", 1)[0] == "apple"'
1000000 loops, best of 3: 0.327 usec per loop
$ python -m timeit 'assert "apple|orange|banana".partition("|")[0] == "apple"'
1000000 loops, best of 3: 0.214 usec per loop
Example performance (rpartition):
$ python -m timeit 'assert "apple|orange|banana".rsplit("|")[-1] == "banana"'
1000000 loops, best of 3: 0.372 usec per loop
$ python -m timeit 'assert "apple|orange|banana".rsplit("|", 1)[-1] == "banana"'
1000000 loops, best of 3: 0.332 usec per loop
$ python -m timeit 'assert "apple|orange|banana".rpartition("|")[-1] == "banana"'
1000000 loops, best of 3: 0.219 usec per loop
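The typical rewrite pattern looks like this (illustrative):

  line = 'apple|orange|banana'
  # before:
  key = line.split('|', 1)[0]
  # after: no list allocation, and taking just the head is explicit
  key, sep, rest = line.partition('|')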
'repo.invalidate()' deletes 'filecache'-ed properties by
'filecache.__delete__()' below via 'delattr(unfiltered, k)'. But
cached objects are still kept in 'repo._filecache'.
  def __delete__(self, obj):
      try:
          del obj.__dict__[self.name]
      except KeyError:
          raise AttributeError(self.name)
If the 'repo' object is reused even after a failed command execution,
referring to a 'filecache'-ed property may reuse the object kept in
'repo._filecache', even when reloading from the file is expected.
Executing a command sequence on the command server is a typical case of
this situation (e0a0f9ad3e4c also tried to fix this issue). For example:
1. start a command execution
2. 'changelog.delayupdate()' is invoked in a transaction scope
   This replaces its own 'opener' with '_divertopener()' for additional
   access to '00changelog.i.a' (aka the "pending file").
3. the transaction is aborted, and the command execution of (1) ends
   After 'repo.invalidate()' at the release of the store lock, the
   changelog object above (whose 'opener' is still the replaced one) is
   deleted from 'repo.__dict__', but is still kept in 'repo._filecache'.
4. the next command execution starts with the same 'repo'
5. referring to 'repo.changelog' may reuse the changelog object kept in
   'repo._filecache', depending on the timestamp of '00changelog.i'
   '00changelog.i' is truncated at transaction failure (even though this
   truncation is unintentional, as described later), and its 'st_mtime'
   changes. But 'st_mtime' doesn't have enough resolution to always
   detect this truncation, so the invalid changelog object kept in
   'repo._filecache' is occasionally reused.
   Then a "No such file or directory" error occurs for
   '00changelog.i.a', which was already removed at (3).
This patch discards objects in '_filecache' other than dirstate at
transaction failure.
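A sketch of that recovery step (simplified; the real change hooks into the
transaction-abort path of localrepository):

  def discardfilecaches(repo):
      unfiltered = repo.unfiltered()
      for k in list(repo._filecache):
          if k == 'dirstate':
              continue                      # dirstate must survive
          del repo._filecache[k]
          unfiltered.__dict__.pop(k, None)  # drop the cached property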
The changes in 'invalidate()' can't be simplified to 'self._filecache =
{}', because 'invalidate()' should keep dirstate in 'self._filecache'.
The 'repo.invalidate()' at "hg qpush" failure is removed in this patch,
because it is now redundant.
This patch doesn't make 'repo.invalidate()' always discard objects in
'_filecache', because 'repo.invalidate()' is also invoked when unlocking
the store lock.
- "always discard objects in filecache at unlocking" may cause a serious
  performance problem for subsequent operations in a normal execution
- but it is impossible to "discard objects in filecache at unlocking only
  on failure", because the 'releasefn' of the lock can't know whether the
  lock scope terminated normally or not
  BTW, using the "with" statement described in PEP 343 for locks may
  resolve this?
After this patch, truncation of '00changelog.i' still occurs at
transaction failure, even though newly added revisions exist only in
'00changelog.i.a' and the size of '00changelog.i' isn't changed by this
truncation.
The 'st_mtime' update on '00changelog.i' implied by this redundant
truncation also affects cache behavior, as described above.
This will be fixed by dropping '00changelog.i' from the list of files to
be truncated on abort in the transaction.
We were assuming everything under 128 was printable ASCII, but there are a
lot of control characters in that range that can't simply be included in
JSON and other targets. We forcibly encode everything under 32, because
those characters are either control characters or oddly printable (like
tab and line endings).
We also add the hypothesis-powered test that caught this.
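The rule, as a sketch (illustrative; the real escaping table lives in the
templater's json support):

  def escapechar(c):
      if ord(c) < 32:
          # JSON forbids raw control characters; \uXXXX-escape them all,
          # including the "printable" ones like tab and newline
          return '\\u%04x' % ord(c)
      return c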
Before bundle2, hook output from hook failures was prefixed with
"remote: ". Up to this point with bundle2, the output was converted to
the message to print in an Abort exception. This had 2 implications:
1) It was unclear whether an error message came from the local repo
or the remote
2) The exit code changed from 1 to 255
This patch changes the handling of error:abort bundle2 parts during push
to prefix the error message with "remote: ". This restores the old
behavior.
We still preserve the behavior of raising an Abort during bundle2
application failure. This is a regression from pre-bundle2 because the
exit code changed.
Because we no longer raise an Abort with the remote's message, we needed
to insert a message for the new Abort. So, I invented a new error
message for that. This is another change from pre-bundle2. However, I
like the new error message because it states unambiguously who aborted the
push, which I think is important for users so they can decide what to do
next.
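Roughly what the push-side handling becomes (a sketch; the exception class
name is an assumption):

  try:
      op = bundle2.processbundle(pushop.repo, reply)
  except bundle2.AbortFromPart as exc:
      pushop.ui.status(_('remote: %s\n') % exc)  # the restored prefix
      raise error.Abort(_('push failed on remote'))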
I have no idea where it came from, but my clone of Mercurial has an
empty filelog for `contrib/hgfixes/__init__.py` - it's *valid*, just
contains no nodes. Without this change, debugrevlog crashes with a
zero division error.
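The guard itself is trivial; as a sketch of the shape of the fix (not the
exact debugrevlog code):

  def average(values):
      # an empty but valid filelog has zero revisions; don't divide
      return sum(values) // len(values) if values else 0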
This behavior regressed as part of the paths API refactoring. Previous
behavior was to accept "default-push" without "default" defined. Current
behavior aborts with "default repository not configured!". This patch
restores the old behavior and adds test coverage for the scenario, which
was absent before.
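The scenario, in hgrc form (the path is illustrative):

  [paths]
  default-push = ssh://hg@example.com/repo
  # no 'default' entry; 'hg push' should still use default-push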
Before this bundle repository were unable to work with compressed
bundle2. We use the same approach as with bundle1, we extract the
changegroup in uncompressed form into a temporary file.
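A sketch of the extraction (simplified; the real logic lives in
bundlerepo):

  import os
  import tempfile

  def writetempbundle(chunks):
      # chunks come from the bundle2 part reader, already decompressed
      fd, name = tempfile.mkstemp(prefix='hg-bundle-')
      fp = os.fdopen(fd, 'wb')
      try:
          for chunk in chunks:
              fp.write(chunk)
      finally:
          fp.close()
      return name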
We allow specifying the url in order to carry it to hooks. This gets us
closer to 'bundle1.apply(...)' and will allow us to remove regressions in
multiple places where we forget to pass the url to hooks.
We allow specifying the source to carry it to hooks. This gets us closer to
'bundle1.apply(...)' and will allow us to remove regressions in multiple places
where we forget to pass the source to hooks.
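Concretely, the entry point grows keyword arguments along these lines (a
sketch; treat the exact signature as an assumption):

  def applybundle(repo, unbundler, tr, source=None, url=None):
      # record source/url on the transaction so hooks receive HG_SOURCE
      # and HG_URL, as the bundle1 apply path already does
      if source is not None:
          tr.hookargs['source'] = source
      if url is not None:
          tr.hookargs['url'] = url
      return processbundle(repo, unbundler, lambda: tr)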
There is a case where the intent is clear and the transaction is not
optional. We want to be able to alter that transaction in a broad and easy
way. We cannot yet have a unified '.apply(repo)' method for bundle1 and
bundle2 because the APIs are still a bit too far apart, but this is a good
step toward getting the rc out.
This can happen when either 'hg resolve --all' is called or a driver-resolved
file is explicitly requested.
This is done as part of 'hg resolve --all' so that users still have a chance to
test their changes before committing them.
The exact semantics here are still to be decided. This does not impact any
non-experimental features.
Thanks to Pierre-Yves David for some advice about this behavior in particular,
and merge drivers in general.
We need to be careful to keep --mark and --unmark working; we don't want
the user to be stuck in a weird state. The exact behavior here is
still to be decided, though.