This patch avoids the loop for "match.files()" having always one
element in revset predicate "filelog()" for efficiency: "match" object
"m" is constructed with "[pat]" as "patterns" argument.
Before this patch, default kind of pattern for revset predicate
"contains()" is treated as the exact file path rooted at the root of
the repository. This decreases usability, because:
- all other predicates taking pattern argument (also "filelog()")
treat such pattern as the path rooted at the current working
directory
- "contains()" doesn't describe this difference in its help
- this difference may confuse users
for example, this prevents revset aliases from sharing same
argument between "contains()" and other predicates
This patch makes default kind of pattern for revset predicate
"contains()" be rooted at the current working directory.
This patch uses "pathutil.canonpath()" instead of creating "match"
object for efficiency.
This patch narrows scope of the variable "m" in the function for
revset predicate "contains()", because it is referred only in "else"
code path of "if not matchmod.patkind(pat)" examination.
Previously, when an obsolete changeset was bookmarked, successor changesets were not considered
when moving the bookmark forward. Now that a bare update will move to the tip most of the
successor changesets, we also update the bookmark logic to allow the bookmark to move with this
update.
Tests have been updated and keep issue4015 covered as well.
Previously, a bare update would ignore any successor changesets thus
potentially leaving you on an obsolete head. This happens commonly when there
is an old bookmark that hasn't been moved forward which is the motivating
reason for this patch series.
Now, we will check for successor changesets if two conditions hold: 1) we are
doing a bare update 2) *and* we are currently on an obsolete head.
If we are in this situation, then we calculate the branchtip of the successor
set and update to that changeset.
Tests coverage has been added.
In some case Backout silently succeeded to back out but left all the change
uncommitted. This may be confusing for user so this changeset add a note
reminding to commit. Other backout case already actively informs the user about
created commit.
Before the changeset the backout process was:
1) go to <target>
2) revert to <target> parent
3) update back to changeset we came from
The two update steps can takes a very long time to move back and forth unrelated
file change between <target> and current working directory.
The new process is just merging current working directory with the parent of
<target> using <target> as ancestor. This give the very same result but skip
the two updates. On big repo with a lot of files and changes that save a lots of
time (x20 for one week window).
The "merge" version (hg backout --merge) is still done with upgrades. We could
imagine using in memory commit to speed it up but this is another fish.
We drop iterrevs which are not needed anymore. The know head are never a
descendant of the updated set. It was possible with the old strip code. This
simplification make the code easier to read an update.
We never use the node of new revisions unless in the very specific case of
closed heads. So we can just use the revision number.
So give another handfull of percent speedup.
Was the behavior correct and the description wrong so it should be updated as
in this patch? Or should the code work as the documentation says?
Both ways could make some sense ... but none of them are obvious in all cases.
One place where it currently cause problems is when the current revision has
another branch head that is closer to tip but closed. 'hg rebase' refuses to
rebase to that as it only see the tip-most unclosed branch head which is the
current revision.
/me kind of likes named branches, but no so much how branch closing works ...
This is often very handy when hacking/debugging.
Calling util.debugstacktrace('hey') from a place in hg will give something like:
hey at:
./hg:38 in <module>
/home/user/hgsrc/mercurial/dispatch.py:28 in run
/home/user/hgsrc/mercurial/dispatch.py:65 in dispatch
/home/user/hgsrc/mercurial/dispatch.py:88 in _runcatch
/home/user/hgsrc/mercurial/dispatch.py:740 in _dispatch
/home/user/hgsrc/mercurial/dispatch.py:514 in runcommand
/home/user/hgsrc/mercurial/dispatch.py:830 in _runcommand
/home/user/hgsrc/mercurial/dispatch.py:801 in checkargs
/home/user/hgsrc/mercurial/dispatch.py:737 in <lambda>
/home/user/hgsrc/mercurial/util.py:472 in check
...
b33db384a66e not only introduced the 'bisect(current)' revset predicate, it
also changed how the 'current' revision is used in combination with --command.
The new behaviour might be ok for --noupdate where the working directory and
its revision shouldn't be used, but it also did that when --command is used to
run a command on the currently checked out revision then it could register the
test result on the wrong revision.
An example:
Before, bisect with --command could use the wrong revision when recording the
test result:
$ hg up -qr 0
$ hg bisect --command "python \"$TESTTMP/script.py\" and some parameters"
changeset 31:58c80a7c8a40: bad
abort: inconsistent state, 31:58c80a7c8a40 is good and bad
Now it works as before and as expected and uses the working directory revision
for the --command result:
$ hg up -qr 0
$ hg bisect --command "python \"$TESTTMP/script.py\" and some parameters"
changeset 0:b99c7b9c8e11: bad
...
The 'copies' method has no test coverage and calls copies.pathcopies with an
incorrect number of parameters and is thus (fortunately) not used. Kill it.
Any invocations of bookmarks other than a plain 'hg bookmarks' will likely
cause a write to the bookmark store. These should be guarded by the wlock.
The repo._bookmarks read should be similarly guarded by the wlock if we're
going to be subsequently writing to it.
Upcoming patches will acquire the wlock for write operations, such as make
inactive, but not read-only ones, such as list bookmarks. Separate out the
status messages so that the code paths can be separated.
Nodemap is not aware of filtering so we need to ask the changelog itself if a
node is known. This is probably a bit slower but such check does not dominated
discovery time. This is necessary if we want to run discovery on filtered repo.
The discovery is not yet ready for filtered repo. Pull was using filtered for
its discovery which is wrong. It worked by dumb luck because discovery mainly
use funtion that does not respect the filtering.
Trying to makes discovery work on filtered repo revealed this bug.
When all changesets in the local repo are either being pushed or remotly known,
we can take a fast path when bundling changeset because we are certain all local
deltas are computed againts base known remotely.
So we have a check to detect this situation, when we did a bare push and nothing
was excluded.
In a coming refactoring, the discovery will run on filtered view and the content
of `outgoing.excluded` will just include unserved (secret) changeset not filtered by the
repoview used to call push (usually "visible"). So we need to check if there is
both no excluded changeset and nothing filtered by the current repoview.
Before that changes, pulled revision that happend to be already known locally
(so, not actually added) was not taken into account when computing the new
common set between local and remote.
It appears that we already know the heads of the pulled set. It is in the
`rheads` variable, so we are just using it and everything is works fine.
We are dropping the, now useless, computation of `added` set in the process.
Moves the code that actually writes to a file to a separate function in
revlog.py. This allows extensions to intercept and use the data being written to
disk. For example, an extension might want to replicate these writes elsewhere.
When cloning the Mercurial repo on /dev/shm with --pull, I see about a 0.3% perf change.
It goes from 28.2 to 28.3 seconds.
The double-for form of list comprehensions gets particularly unreadable
when you throw in an 'if' condition. This expands the only remaining
instance of the double-for syntax in our codebase into a loop.
Reminder: a changeset is said "bumped" if it tries to obsolete a immutable
changeset.
The previous algorithm for computing bumped changeset was:
1) Get all public changesets
2) Find all they successors
3) Search for stuff that are eligible for being "bumped"
(mutable and non obsolete)
The entry size of this algorithm is `O(len(public))` which is mostly the same as
`O(len(repo))`. Even this this approach mean fewer obsolescence marker are
traveled, this is not very scalable.
The new algorithm is:
1) For each potential bumped changesets (non obsolete mutable)
2) iterate over precursors
3) if a precursors is public. changeset is bumped
We travel more obsolescence marker, but the entry size is much smaller since
the amount of potential bumped should remains mostly stable with time `O(1)`.
On some confidential gigantic repo this move bumped computation from 15.19s to
0.46s (×33 speedup…). On "smaller" repo (mercurial, cubicweb's review) no
significant gain were seen. The additional traversal of obsolescence marker is
probably probably counter balance the advantage of it.
Other optimisation could be done in the future (eg: sharing precursors cache
for divergence detection)
Detection of bumped changeset should use `allprecursors(<mutable>)` instead or
`allsuccessors(<immutable>)` so we need the all precursors function to exists.
Changeset aad678a92970 moved `subsettable` from `mercurial/repoview.py` to
`mercurial/branchmap.py`. This mean that `filtertable` and `subsettable` are no
longer next to each other. So we add a comment to remind people to update both.
parse() cannot be called at the same time because a parser object keeps its
states. This is no problem for command-line hg client, but it would cause
strange errors in multi-threaded hgweb.
Creating parser object is not too expensive.
original:
% python -m timeit -s 'from mercurial import revset' 'revset.parse("0::tip")'
100000 loops, best of 3: 11.3 usec per loop
thread-safe:
% python -m timeit -s 'from mercurial import revset' 'revset.parse("0::tip")'
100000 loops, best of 3: 13.1 usec per loop
Passing a non-string to parsers.parse_index2() causes Mercurial to crash
instead of raising a TypeError (found on Mac OS X 10.8.5, Python 2.7.6):
import mercurial.parsers as parsers
parsers.parse_index2(0, 0)
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 parsers.so 0x000000010e071c59 _index_clearcaches + 73 (parsers.c:644)
1 parsers.so 0x000000010e06f2d5 index_dealloc + 21 (parsers.c:1767)
2 parsers.so 0x000000010e074e3b parse_index2 + 347 (parsers.c:1891)
3 org.python.python 0x000000010dda8b17 PyEval_EvalFrameEx + 9911
This happens because when arguments of the wrong type are passed to
parsers.parse_index2(), indexType's initialization function index_init() in
parsers.c leaves the indexObject instance in a state that indexType's
destructor function index_dealloc() cannot handle.
This patch moves enough of the indexObject initialization code inside
index_init() from after the argument validation code to before it.
This way, when bad arguments are passed to index_init(), the destructor
doesn't crash and the existing code to raise a TypeError works. This
patch also adds a test to check that a TypeError is raised.
Previously, this required -f because we didn't consider obsolete changesets
(and their children ... or successors of those children, etc.). We now use
obsolete.foreground to calculate acceptable changesets when advancing the
bookmark.
Test coverage has been added.
Backslashes (\) in paths were encoded to %C5 when converting from url to
string. This does not look nice for windows paths. And it introduces many
problems when running tests on windows.
When trying to turn a draft changeset into a secret changeset, I was
told:
% hg phase -s .
cannot move 1 changesets to a more permissive phase, use --force
no phases changed
That message struck me as being backwards -- the secret phase feels
less permissive to me since it restricts the changesets from being
pushed.
We don't use the word "permissive" elsewhere, 'hg help phase' talks
about "lower phases" and "higher phases". I therefore reformulated the
error message to be
cannot move 1 changesets to a higher phase, use --force
That is not perfect either, but more in line with the help text. An
alternative could be
cannot move phase backwards for 1 changesets, use --force
which fits better with the help text for --force.
Before the templater got extended for nested expressions, it made
sense to decode string escapes across the whole string. Now we do it
on a piece by piece basis.
The search mode description can't be translated by itself, since
it's displayed as part of a template phrase (the "Assuming ..."
/ "Use ... instead" bits). Just drop the translation markers for
now, since the templates themselves currently do not support
translations.
Paths ending with \ will fail the verification introduced in 0bc0c17d663e when
checking out on Windows ... and if it didn't fail it would probably not do what
the user expected.
Lines with only a directive are not deleted anymore because they are detected
before comments are deleted by prunecomments().
addmargins() will be adapted later.
Forgets need to be in the beginning of the action list, same as removes. This
lets us avoid clashes in the dirstate where a directory is forgotten and a
file with the same name is added, or vice versa.
When aborting a rebase where tip-1 is public, rebase would fail to undo the merge
state. This caused unexpected dirstate parents and also caused unshelve to
become unabortable (since it uses rebase under the hood).
The problem was that rebase uses -2 as a marker rev, and when it checked for
immutableness during the abort, -2 got resolved to the second to last entry in
the phase cache.
Adds a test for the fix. Add exception to phase code to prevent this in the
future.
This is arguably a workaround, a better fix may be in the repo to
ensure that it won't list a file 'modified' unless there is a file
context for the previous version.
On Windows, only double quotation mark can quote command line
arguments.
So, this patch uses double quotation mark to quote command line
arguments in all examples of online help document.
This patch revises hint message from "for detail about" introduced by
changeset 49ed20ea8da0 to "for details about", to unify it with the
hint message introduced by proceeding patch.
This patch adds more detailed explanation about "--force" to online
help document of "hg push" to prevent novice users to execute "push
--force" easily without understanding about problems of multiple
branch heads in the repository.
"use push -f to force" in the hint at abortion of "hg push" may cause
novice users to execute "push -f" easily without understanding about
problems of multiple branch heads in the repository.
This patch hides description about "-f" in the hint, and leads into
seeing "hg help push" for details about pushing new heads.
Before this patch, python modules of each extensions can't import
another one in own extension by absolute name, because root modules of
each extensions are loaded with "hgext_" prefix.
For example, "import extroot.bar" in "extroot/foo.py" of "extroot"
extension fails, even though "import bar" in it succeeds.
Installing extensions into site-packages of python library path can
avoid this problem, but this solution is not reasonable in some cases:
using binary package of Mercurial on Windows, for example.
This patch retries to import with "hgext_" prefix after ImportError,
if the module in the extension may try to import another one in own
extension.
This patch doesn't change some "_import()"/"_origimport()" invocations
below, because ordinary extensions shouldn't cause such invocations.
- invocation of "_import()" when root module imports sub-module by
absolute path without "fromlist"
for example, "import a.b" in "a.__init__.py".
extensions are loaded with "hgext_" prefix, and this causes
execution of another (= fixed by this patch) code path.
- invocation of "_origimport()" when "level != -1" with "fromlist"
for example, importing after "from __future__ import
absolute_import" (level == 0), or "from . import b" or "from .a
import b" (0 < level),
for portability between python versions and environments,
extensions shouldn't cause "level != -1".
Before this patch, demandimport of Mercurial may fail to load external
libraries using "from __future__ import absolute_import": for example,
importing "foo" in "bar.baz" module will load "bar.foo" if it exists,
even though "absolute_import" is enabled in "bar.baz" module.
So, extensions for Mercurial can't use such external libraries.
This patch saves "level" of import request for on-demand module
loading in the future: default value of level is -1, and level is 0
when "absolute_import" is enabled.
"level" value is passed to built-in import function in
"_demandmod._load()" and it should load target module correctly.
This patch changes only one "_demandmod" construction case other than
cases below:
- construction in "_demandmod._load()"
this code path should be used only in relative sub-module
loading case
- constructions other than patched one in"_demandimport()"
these code paths shouldn't be used in "level != -1" case
hg update . (or equivalents) are effectively no-ops in just about all
circumstances. These sorts of updates can be especially common in a
bookmark-oriented workflow. This saves us a status check and a manifest
decompression, which means that on a repo with over 210,000 files, this brings
hg update . down from 2.5 seconds to 0.15.
There is one change in behavior: a file that was added, not committed, and then
deleted but not removed used to be removed from the dirstate. With this patch
it isn't. This is what causes the change in test-mq-qpush-exact.t. This seems
like it's enough of an edge case to not be worth handling.
The output of test-empty.t changes because those files are not yet created.
Before this patch, each feature setup functions for localrepository
class should examine whether corresponding extension is enabled or not
by themselves.
This patch invokes only feature setup functions defined in module of
enabled extensions, and it makes implementation of feature setup
functions easier and simpler.
Push is currently allowed to create a new head if there is a remote
bookmark that will be updated to point to the new head. If the
bookmark is not known remotely then push aborts, even if a -B argument
is about to push the bookmark. This change allows push to continue in
this case. This does not require a wireproto force.
This modifies slightly the behavior introduced in fcc482469a3c to allow
showextras to return a hybrid, rather than showlist. The example in the
template help file now executes and returns meaningful results.
Running perfmoonwalk on the Mercurial repo (with almost 20,000 changesets) on
Mac OS X with an SSD, before this change:
$ hg --config format.chunkcachesize=1024 perfmoonwalk
! wall 2.022021 comb 2.030000 user 1.970000 sys 0.060000 (best of 5)
(16,154 cache hits, 3,840 misses.)
$ hg --config format.chunkcachesize=4096 perfmoonwalk
! wall 1.901006 comb 1.900000 user 1.880000 sys 0.020000 (best of 6)
(19,003 hits, 991 misses.)
$ hg --config format.chunkcachesize=16384 perfmoonwalk
! wall 1.802775 comb 1.800000 user 1.800000 sys 0.000000 (best of 6)
(19,746 hits, 248 misses.)
$ hg --config format.chunkcachesize=32768 perfmoonwalk
! wall 1.818545 comb 1.810000 user 1.810000 sys 0.000000 (best of 6)
(19,870 hits, 124 misses.)
$ hg --config format.chunkcachesize=65536 perfmoonwalk
! wall 1.801350 comb 1.810000 user 1.800000 sys 0.010000 (best of 6)
(19,932 hits, 62 misses.)
$ hg --config format.chunkcachesize=131072 perfmoonwalk
! wall 1.805879 comb 1.820000 user 1.810000 sys 0.010000 (best of 6)
(19,963 hits, 31 misses.)
We may want to change the default size in the future based on testing and
user feedback.
When reading a revlog chunk, instead of reading up to 64 KB ahead of the
request offset and caching that, this change caches a fixed window before
and after the requested data that falls on 64 KB boundaries. This increases
cache hits when reading revlogs backwards.
Running perfmoonwalk on the Mercurial repo (with almost 20,000 changesets) on
Mac OS X with an SSD, before this change:
$ hg perfmoonwalk
! wall 2.307994 comb 2.310000 user 2.120000 sys 0.190000 (best of 5)
(Each run has 10,668 cache hits and 9,304 misses.)
After this change:
$ hg perfmoonwalk
! wall 1.814117 comb 1.810000 user 1.810000 sys 0.000000 (best of 6)
(19,931 cache hits, 62 misses.)
On a busy NFS share, before this change:
$ hg perfmoonwalk
! wall 17.000034 comb 4.100000 user 3.270000 sys 0.830000 (best of 3)
After:
$ hg perfmoonwalk
! wall 1.746115 comb 1.670000 user 1.660000 sys 0.010000 (best of 5)
Before this patch, phase of newly created commit is determined by
"phases.new-commit" configuration regardless of phase of state in each
subrepositories.
For example, this may cause the "public" revision in the parent
repository referring the "secret" one in subrepository.
This patch checks phase of state in each subrepositories before
committing in the parent, and aborts or changes phase of newly created
commit if subrepositories have more restricted phase than the parent.
This patch uses "follow" as default value of "phases.checksubrepos"
configuration, because it can keep consistency between phases of the
parent and subrepositories without breaking existing tool chains.
This change causes an informative ImportError to be raised when importing
the extension module parsers if the minor version of the currently-running
Python interpreter doesn't match that of the Python that was used when
compiling the extension module. Here is an example of what the new error
looks like:
Traceback (most recent call last):
File "test.py", line 1, in <module>
import mercurial.parsers
ImportError: Python minor version mismatch: The Mercurial extension
modules were compiled with Python 2.7.6, but Mercurial is currently using
Python with sys.hexversion=33883888: Python 2.5.6
(r256:88840, Nov 18 2012, 05:37:10)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))]
at: /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/
Python.app/Contents/MacOS/Python
The reason for raising an error in this scenario is that Python's C API
is known not to be compatible from minor version to minor version, even
if sys.api_version is the same. See for example this Python bug report
about incompatibilities between 2.5 and 2.6+:
http://bugs.python.org/issue8118
These incompatibilities can cause Mercurial to break in mysterious,
unforeseen ways. For example, when Mercurial compiled with Python 2.7 was
run with 2.5, the following crash occurred when running "hg status":
http://bz.selenic.com/show_bug.cgi?id=4110
After this crash was fixed, running with Python 2.5 no longer crashes, but
the following puzzling behavior still occurs:
$ hg status
...
File ".../mercurial/changelog.py", line 123, in __init__
revlog.revlog.__init__(self, opener, "00changelog.i")
File ".../mercurial/revlog.py", line 251, in __init__
d = self._io.parseindex(i, self._inline)
File ".../mercurial/revlog.py", line 158, in parseindex
index, cache = parsers.parse_index2(data, inline)
TypeError: data is not a string
which can be reproduced more simply with:
import mercurial.parsers as parsers
parsers.parse_index2("", True)
Both the crash and the TypeError occurred because the Python C API's
PyString_Check returns the wrong value when the C header files from
Python 2.7 are run with Python 2.5. This is an example of an
incompatibility of the sort mentioned in the Python bug report above.
Failing fast with an informative error message will result in a better
user experience in cases like the above. The information in the ImportError
will also simplify troubleshooting for those on Mercurial mailing lists,
the bug tracker, etc.
This patch only adds the version check to parsers.c, which is sufficient
to affect command-line commands like "hg status" and "hg summary".
An idea for a future improvement is to move the version-checking C code
to a more central location, and have it run when importing all
Mercurial extension modules and not just parsers.c.
When creating patches modifying binary files using "git format-patch",
git creates 'literal' and 'delta' hunks. Mercurial currently supports
'literal' hunks only, which makes it impossible to import patches with
'delta' hunks.
This changeset adds support for 'delta' hunks. It is a reimplementation
of patch-delta.c from git :
http://git.kernel.org/cgit/git/git.git/tree/patch-delta.c
Previously, directories were added with the trailing slash and, if there was
only one completion, then another ambiguous entry was created using '.', as
follows:
$ hg rm mer<TAB>
mercurial/./ mercurial//
This was added in bc559aff745c (though, some logic existed before that) to work
around bash completion adding a space after the sole entry because we treated
directories and files the same. We no longer do that now so we remove this
unneeded code.
Tests have been updated to match this new behavior.
Some debuggers, such as ipdb, load escape codes and color codes even when later
turned off. This will affect scripts that do simple parsing and can't handle
escape codes. Therefore, we only load a custom debugger if ui.plain() is false.
When try to compile on x64 OS X, I get this warning:
mercurial/parsers.c:931:27: warning: implicit conversion loses integer precision
: 'long' to 'int' [-Wshorten-64-to-32]
? 4 : self->raw_length / 2;
The patch verifies if value of self->raw_length falls bellow INT_MAX; if not,
it raises the ValueError exception.
If value of self->raw_length is greater than 4, it's casted to int type, to
eliminate the warning.
There are currently two different ways we can have no active bookmark:
.hg/bookmarks.current being missing and it being an empty file. This patch and
upcoming ones make an empty file the only way to represent no active bookmarks.
This is the right choice because it matches the state that a new repository
without bookmarks will be in.
This patch makes "lock.lock.__init__()" take both vfs and lock file
path relative to vfs, instead of absolute path to lock file.
This allows lock file to be accessed via vfs.
This patch rewrites "copystore()" with vfs, because succeeding patch
requires "lock.lock()" invocation with vfs.
This patch uses 'dstbase + "/lock"' instead of "join()" with both
elements even on Windows environment but it should be reasonable,
because target files given from "store.copyfiles()" already uses "/"
as path separator.
"util.copyfiles()" between two vfs-s may have to be rewritten in the
future.
Before this patch, "localrepo.py" has many methods defining local
variable "lock", even though it imports "lock" module as "lock". This
ambiguity decreases readability.
Before this patch, unlink target file is once opened before unlinking,
because "opener" before vfs migration doesn't have "unlink()"
function.
This patch uses "vfs.unlink()" instead of "open()" and "fp.name".
Copying a repos local configuration to another repo is a bad idea because the
2nd repo gets the configuration of the first. Prevent this by really calling
repo.baseui.copy when repo.ui.copy is called.
This requires some changes in commandserver which needs to clone repo.ui for
rejecting temporary changes.
This patch has its roots back in the topic "repo isolation" around 68ae3063a47d
and was suggested by mpm.
The {branches} keyword dates to pre-1.0 Mercurial's tag-like branch
scheme which allowed changesets to be on multiple branches. This is
the last visible vestige of that scheme, users should instead be using
{branch}, possibly with if().
During a commit amend there are 4 manifests being handled:
- original commit
- temporary commit
- amended commit
- merge base
This causes a manifest cache miss which hurts perf on large repos. On a large
repo, this fix causes amend to go from 6 seconds to 5.5 seconds.
The previous revlog strip computation would walk every rev in the revlog, from
the bottom to the top. Since we're usually stripping only the top few revs of
the revlog, this was needlessly expensive on large repos.
The new algorithm walks the exact number of revs that will be stripped, thus
making the operation not dependent on the number of revs in the repo.
This makes amend on a large repo go from 8.7 seconds to 6 seconds.
When computing the commonmissing, it greedily computes the entire set
immediately. On a large repo where the majority of history is irrelevant, this
causes a significant slow down.
Replacing it with a lazy set makes amend go from 11 seconds to 8.7 seconds.
Upcoming patches will allow the file cache to watch over multiple files, and
call the decorated function again if any of the files change.
The particular use case for this is the bookmark store, which needs to be
invalidated if either .hg/bookmarks or .hg/bookmarks.current changes. (This
doesn't currently happen, which is a bug. This bug will also be fixed in
upcoming patches.)
This is a step towards breaking an import cycle between revset and
repoview. Import cycles happened to work in Python 2 with implicit
relative imports, but breaks on Python 3 when we start using explicit
relative imports via 2to3 rewrite rules.
Before this patch, duplicated obsolescence markers could slip into an
obstore if the bookmark was unknown locally and duplicated in the
incoming obsolescence stream.
Existing duplicate markers will not be automatically removed but
they'll stop propagating. Having a few duplicated markers is harmless
and people have been warned evolution is <blink>experimental</blink>
anyway.
This patch adds "updateremote()", which uses "compare()" to compare
bookmarks between the local and the remote repositories, to replace
pushing local bookmarks in "localrepository.push()".
This patch adds "pushtoremote()", which uses "compare()" to compare
bookmarks between the local and the remote repositories, to replace
pushing local bookmarks in "commands.push()".
To update entries in bmstore "localmarks", this patch uses
"bin(changesetid)" instead of "repo[changesetid].node()" used in
original "updatefromremote()" implementation, because the former is
cheaper than the latter.
This is the same thing which was done for changelog earlier, and it doesn't
affect performance at all. This change will make it possible to get the first
entry of the next page easily without computing the list twice.
Before this patch, almost all of check code in
"casecollisionauditor.__call__()" is executed, even if specified
filename is already checked, because "f in self._newfiles" is examined
lastly.
In addition to it, adding "fl" to "self._loweredfiles" and "f" to
"self._newfiles" are also redundant in such case.
This patch checks "f in self._newfiles" first, and returns immediately
to avoid execution of check code for efficiency.
dirstate.walk will return unknown files that were explicitly requested, even
if listunknown is false. There's no point in dropping these files on the
floor in dirstate.status.
This has no effect on any current callers, because all of them assume the
unknown list is empty and ignore it. Future callers may find it useful,
though.
When a user reaches the end of repo history with graph view, it (unsuccessfully)
tried to load more entries. This patch disables further loading attempts.
This patch gets stat object of target path by "vfs.lstat()", and
examines stat object to know the type of it. This follows the way in
"workingctx.add()".
This should be cheaper than original implementation invoking
"lexists()", "isfile()" and "islink()".
This patch also changes paths added to "rejected" list from full path
(referred by "p") to relative one (referred by "f"), when type of
target path is neither file nor symlink.
This change should be reasonable, because the path added to "rejected"
list is relative one, when "OSError" is raised at "lstat()".
Just invoking "os.fstat()" with "file.fileno()" doesn't require non
ANSI file API, because filename is not used for invocation of
"os.fstat()".
But "util.fstat()" should invoke "os.stat()" with "fp.name", if file
object doesn't have "fileno()" method for portability, and "fp.name"
may cause invocation of non ANSI file API.
So, this patch makes the constructor of appender class invoke
"util.fstat()" via vfs, to encapsulate filename handling.
The changegroup hook runs when the repo lock is released, but it's possible that
multiple transactions have happened during that single lock and therefore the
commits the hook was for may be gone. This is the case in the shelve extension
where it adds a commit and strips it in the same lock but different
transactions (which results in warning messages during unshelve on hgsubversion
repos).
A real fix would be to attach the hook to the transaction instead, but that
might have unknown consequences. Since we're this close to code-freeze, this fix
just prevents the hook from running if the commit disappeared.
There is a potential race here, which I suspect I've spotted in the wild, where
something reads the pid file after the parent exits but before the child has
had a chance to write to it. Moving writing the file to the parent causes this
to no longer be an issue.
The headers attribute is not initialized in certain error situations
(e.g. http 400 bad request). Check for self.headers before we attempt
to access it.
A `unfilteredpropertycache` is a kind of `propertycache` used on `localrepo` to
unsure it will always be run against unfiltered repo and stored only once.
As the cached value is never stored in the repoview instance, the descriptor
will always be called. Before this patch such calls always result in a call to
the `__get__` method of the `propertycache` on the unfiltered repo. That was
recomputing a new value on every access through a repoview.
We can't prevent the repoview's `unfilteredpropertycache` to get called on every
access. In that case the new code makes a standard attribute access to the
property. If a value is cached it will be used.
The `propertycache` test file have been augmented with test about this issue.
Propertycache used standard attribute assignment. In the repoview case, this
assignment was forwarded to the unfiltered repo. This result in:
(1) unfiltered repo got a potentially wrong cache value,
(2) repoview never reused the cached value.
This patch replaces the standard attribute assignment by an assignment to
`objc.__dict__` which will bypass the `repoview.__setattr__`. This will not
affects other `propertycache` users and it is actually closer to the semantic we
need.
The interaction of `propertycache` and `repoview` are now tested in a python
test file.
Before this patch, "hg help -k KEYWORD" fails, if there is the
extension of which name includes ".", because "extensions.load()"
invoked from "help.topicmatch()" fails to look such extension up, even
though it is already loaded in.
"help.topicmatch()" invokes "extensions.load()" with the name gotten
from "extensions.enabled()". The former expects full name of extension
(= key in '[extensions]' section), but the latter returns names
shortened by "split('.')[-1]". This difference causes failure of
looking extension up.
This patch adds "shortname" argument to "extensions.enabled()" to make
it return shortened names only if it is True. "help.topicmatch()"
turns it off to get full name of extensions.
Then, this patch shortens full name of extensions by "split('.')[-1]"
for showing them in the list of extensions.
Shortening is also applied on names gotten from
"extensions.disabled()" but harmless, because it returns only
extensions directly under "hgext" and their names should not include
".".
Previously basecache was incorrectly initialized before adding the first
revision from a changegroup. Basecache value influences when full revisions are
stored in revlog (when using generaldelta). As a result it was possible to
generate a generaldelta-revlog that could be bigger by arbitrary factor than its
non-generaldelta equivalent.
Python docs are a little unclear, but mpm reports reading the OpenSSL
source code shows that PROTOCOL_SSLv23 allows TLS whereas
PROTOCOL_SSLv3 does not.
Somewhere before 2.7, a change [82beb9b16505] was committed that
entailed a large performance regression when bundling (and therefore
remote cloning) repositories. For each file in the repository, it would
recompute the set of needed changesets even though it is the same for
all files. This computation would dominate bundle runtimes according to
profiler output (by 10x or more).
Some changesets can be wrongly reported as matched by this predicate
due to searching in a string joined with spaces and not individually.
A test case added, which fails without this fix.
Before this patch, tag overwriting history is not written into tag
cache file ".hg/cache/tags".
This may give higher priority to local tag than global one, even if
the former is overwritten by the latter, because tag overwriting
history is used to compare priorities of them (as "rank").
In such cases, "hg tags" invocations using tag cache file shows
incorrect tag information.
This patch writes tag overwriting history also into tag cache file.
This makes `hg pull --update` behave the same wrt the active bookmark as
`hg pull && hg update` does as of 13ea5e437ff8. A helper function,
bookmarks.calculateupdate, is added to prevent code duplication between
postincoming and update.
Make push -r/-B update only these bookmarks that point to pushed revisions
or their ancestors, so we can be sure that commit pointed by bookmark is
present in the remote reposiory. Previously push tried to update all shared
bookmarks.
With the follow_symlinks option, sshfs will successfully create links
while claiming it encountered an I/O error. In addition, depending on
the type of link, it may subsequently be impossible to delete the link
via sshfs. Our existing link to '.' will cause sshfs to think the link
is a directory, and thus cause unlink to give EISDIR. Links to
non-existent names or circular links will cause the created link to not even
be visible.
Thus, we need to create a new temporary file and link to that. We'll
still get a failure, but we'll be able to remove the link.
ninteresting indicates the number of non-zero elements in the interesting
array, not the number of elements in the final list. Since elements in
interesting can stand for more than one gca, limiting the number of results to
ninteresting is an error.
Tests for issue3984 are included.
The invariant this code tries to hold is that ninteresting is the number of
non-zero elements in the interesting array. interesting[nsp] is incremented at
the same time as interesting[sp] is decremented. So if interesting[nsp] was
previously 0, ninteresting shouldn't be decremented.
Given that N is maximum revision number in a repo, than if a revision with
number N-100n or N-100n+1 (for any integer n) is found with a hgweb search,
this revision is duplicated in search results.
We can't (easily) force SSL version on older Pythons, but on 2.6 and
later we can force SSLv3, which is safer and widely supported. This
also appears to work around a bug in IIS detailed in issue 3905.
This function looped over every node due to getElementsByTagName('*'), instead
of using selectors. In this patch we use querySelectorAll('.age') and process
only these nodes, which is much faster and also doesn't require extra condition.
Browser compatibility isn't sacrificed: IE 8+, FF 3.5+, Opera 10+.
repo.transaction always writes to stderr when a transaction aborts. In order to
be able to abort a transaction quietly (e.g shelve needs a temporary view on
the repo) we need to make the report level configurable.
Before this patch, pushing with --new-branch permits to create
multiple headed branch on the destination repository.
But permitting to create new branch should be different from
permitting to create multiple heads on branch.
This patch prevents from careless pushing multiple headed new branch,
and requires --force to push such branch forcibly.
Since repoview in 2.5 we do not make special call to `branchmap` when stripping.
We just recompute the branchmap from a lower subset that still has valid
branchmap. So I'm dropping this dead code.
It was unused. note how it is only extended if the list is empty. So it's always
empty at the end.
We could try to fix that, however this would part of the code is to be removed
in the next changeset as we do not run `branchmap` on truncated repo since
`repoview` in 2.5.
getbe32 and putbe32 need to behave differently on big-endian and little-endian
systems. On big-endian ones, they should be roughly equivalent to the identity
function with a cast, but on little-endian ones they should reverse the order
of the bytes. That is achieved by the original definition, but
__builtin_bswap32 and _byteswap_ulong, as the names suggest, swap bytes around
unconditionally.
There was no measurable performance improvement, so there's no point adding
extra complexity with even more ifdefs for endianncess.
When a subrepo has changed on the local and remote revisions, prompt the user
whether it wants to merge those subrepo revisions, keep the local revision or
keep the remote revision.
Up until now mercurial would always perform a merge on a subrepo that had
changed on the local and the remote revisions. This is often inconvenient. For
example:
- You may want to perform the actual subrepo merge after you have merged the
parent subrepo files.
- Some subrepos may be considered "read only", in the sense that you are not
supposed to add new revisions to them. In those cases "merging a subrepo" means
choosing which _existing_ revision you want to use on the merged revision. This
is often the case for subrepos that contain binary dependencies (such as DLLs,
etc).
This new prompt makes mercurial better cope with those common scenarios.
Notes:
- The default behavior (which is the one that is used when ui is not
interactive) remains unchanged (i.e. merge is the default action).
- This prompt will be shown even if the ui --tool flag is set.
- I don't know of a way to test the "keep local" and "keep remote" options (i.e.
to force the test to choose those options).
# HG changeset patch
# User Angel Ezquerra <angel.ezquerra@gmail.com>
# Date 1378420708 -7200
# Fri Sep 06 00:38:28 2013 +0200
# Node ID 2fb9cb0c7b26303ac3178b7739975e663075857d
# Parent 796d34e1b749b79834321ef1181ed8433a5515d9
merge: let the user choose to merge, keep local or keep remote subrepo revisions
When a subrepo has changed on the local and remote revisions, prompt the user
whether it wants to merge those subrepo revisions, keep the local revision or
keep the remote revision.
Up until now mercurial would always perform a merge on a subrepo that had
changed on the local and the remote revisions. This is often inconvenient. For
example:
- You may want to perform the actual subrepo merge after you have merged the
parent subrepo files.
- Some subrepos may be considered "read only", in the sense that you are not
supposed to add new revisions to them. In those cases "merging a subrepo" means
choosing which _existing_ revision you want to use on the merged revision. This
is often the case for subrepos that contain binary dependencies (such as DLLs,
etc).
This new prompt makes mercurial better cope with those common scenarios.
Notes:
- The default behavior (which is the one that is used when ui is not
interactive) remains unchanged (i.e. merge is the default action).
- This prompt will be shown even if the ui --tool flag is set.
- I don't know of a way to test the "keep local" and "keep remote" options (i.e.
to force the test to choose those options).
This causes httpclient to use the same SSL settings as the rest of
Mercurial, and adds an easy extension point for later modifications to
our ssl handling.
This lets us inject our own ssl.wrap_socket equivalent into
httpclient, which means that any changes we make to our ssl handling
can be *entirely* on our side without having to muck with httpclient,
which sounds appealing. For example, an extension could wrap
sslutil.ssl_wrap_socket with an api-compatible wrapper and then tweak
SSL settings more precisely or use GnuTLS instead of OpenSSL.
Prior to this change, we default to SSLv23, which is insecure because
it allows use of SSLv2. Unfortunately, there's no constant for OpenSSL
to let us use SSLv3 or TLS - we have to pick one or the other. We
expose a knob to revert to pre-TLS SSL for the user that has an
ancient server that lacks proper TLS support.
Previously, the error message for a dirty non-linear update was the same (and
relatively unhelpful) whether or not a rev was specified. This patch and an
upcoming one will introduce separate, more helpful hints.
While the default mode appends all the new entries to a container on the page,
the graph mode resizes canvas correctly, and repaints the graph to include
newly received data.
Namely, this allows the next page pointer to be not only revision hash given
in page code, but also any value computed from the value for previous page.
Before this patch, all localrepositories support same features,
because supported features are managed by the class variable
"supported" of "localrepository".
For example, "largefiles" feature provided by largefiles extension is
recognized as supported, by adding the feature name to "supported" of
"localrepository".
So, commands handling multiple repositories at a time like below
misunderstand that such features are supported also in repositories
not enabling corresponded extensions:
- clone/pull from or push to localhost
- recursive execution in subrepo tree
"reposetup()" can't be used to fix this problem, because it is invoked
after checking whether supported features satisfy ones required in the
target repository.
So, this patch adds the set object named as "featuresetupfuncs" to
"localrepository" to manage hook functions to setup supported features
of each repositories.
If any functions are added to "featuresetupfuncs", they are invoked,
and information about supported features is managed in each
repositories individually.
This patch also adds checking below:
- pull from localhost: whether features supported in the local(= dst)
repository satisfies ones required in the remote(= src)
- push to localhost: whether features supported in the remote(= dst)
repository satisfies ones required in the local(= src)
Managing supported features by the class variable means that there is
no difference of supported features between each instances of
"localrepository" in the same Python process, so such checking is not
needed before this patch.
Even with this patch, if intermediate bundlefile is used as pulling
source, pulling indirectly from the remote repository, which requires
features more than ones supported in the local, can't be prevented,
because bundlefile has no information about "required features" in it.
Before this patch, "extensions.extensions()" always lists up all
loaded extensions. So, commands handling multiple repositories at a
time like below enable extensions unexpectedly.
- clone from or push to localhost: extensions enabled only in the
source are enabled also in the destination
- pull from localhost: extensions enabled only in the destination
are enabled also in the source
- recursive execution in subrepo tree: extensions enabled only in
the parent or some of siblings in the tree are enabled also in
others
In addition to it, extensions disabled locally may be enabled
unexpectedly.
This patch checks whether each of extensions should be listed up or
not, if "ui" is specified to "extensions.extensions()", and invokes
"reposetup()" of each extensions only for repositories enabling it.
This makes it possible to make keyword search in case the search query also
specifies an exact revision (like '1234' or 'abcdef'), or a revset expression.
This function should be correctly called on a page, otherwise there is
no effect.
When called, it makes ajax requests for the next portion of changesets when the
user scrolls to the end. Also, when the monitor is high so that the
default amount of changesets isn't enough to fill it, multiple portions are
loaded if needed after the page load.
This function performs an asynchronous HTTP request and calls provided
callbacks:
- onstart: request is sent
- onsuccess: response is received
- onerror: some error occured
- oncomplete: response is fully processed and all other callbacks finished
Get the whole list of entries before rendering instead of using lazy evaluation.
This doesn't affect the performance for usual case when the entries are shown
anyway. When both entries and latestentry are used, this performs unnoticeably
faster, and for pages which use only latestentry (quite uncommon case) it
would be a bit slower.
This change will make it possible to get the first entry of the next page easily
without computing the list twice.
Running "hg log <pattern or directory>" on large repos took a very, very long
time because it first read ctx.files() for every commit before even starting to
process the results.
This change makes the ctx.files() check lazy, which makes the command start
producing results immediately.
memchr is usually smarter than a simple for loop. With gcc 4.4.6 and glibc 2.12
on x86-64, for a 20 MB, 200,000 file manifest, parse_manifest goes from 0.116
seconds to 0.095 seconds.
This mode is used when all the conditions are met:
- 'reverse(%s)' % query string can be parsed to a revset tree
- this tree has depth more than two, i.e. the query has some part of
revset syntax used
- the repo can be actually matched against this tree, i.e. it has only existent
function/operators and revisions/tags/bookmarks specified are correct
- no revset regexes are used in the query (strings which start with 're:')
- only functions explicitly marked as safe in revset.py are used in the query
Add several new tests for different parsing conditions and exception handling.
In case we don't have a cached text already, add the base rev to the list
passed to _chunks. In the cached case this also avoids unnecessarily preloading
the chunk for the cached rev.
We do this in a somewhat hacky way, relying on the fact that our sole caller
preloads the cache right before calling us. An upcoming patch will make this
more sensible.
For a 20 MB manifest with a delta chain of > 40k, perfmanifest goes from 0.49
seconds to 0.46.
Previously the length of data preloaded did not account for the interleaved io
contents. This meant that we'd sometimes have cache misses in _chunks despite
the preloading.
Having a correctly filled out cache will become essential in an upcoming patch.
This moves _chunkraw into the loop. Doing that improves revlog decompression --
in particular, manifest decompression -- significantly. For a 20 MB manifest
which is the result of a > 40k delta chain, hg perfmanifest improves from 0.55
seconds to 0.49 seconds.
Currently, we have basectx that serves as a common ancestor of all contexts. We
will now add a new class commitablectx that will inherit from basectx and will
serve as a common place for code that will be shared between mutable contexts,
e.g. workingctx and memctx.
Full walks are only necessary when the caller needs the list of clean files.
addremove by definition doesn't need them.
With this patch and an extension that produces a subset of results for
dirstate.walk when full is False, on a repository with over 200,000 files, hg
addremove drops from 2.8 seconds to 0.5.
Previously we'd place files written in the last second in the lookup set. This
can lead to pathological cases where a file always remains in the lookup set if
it gets modified before the next time status is run.
With this patch, only the mtime of those files is invalidated. This means that
if a file's size or mode changes, we can immediately declare it as modified
without needing to compare file contents.
On Windows, there are two ways symlinks can manifest themselves:
1. As placeholders: text files containing the symlink's target. This is what
usually happens with fresh clones on Windows.
2. With their dereferenced contents. This happens with clones accessed over NFS
or Samba.
In order to handle case 2, 28af3e0c54f0 made dirstate.status ignore all symlink
placeholders on Windows. It doesn't ignore symlinks in the lookup set, though,
since those don't have the link bit set. This is problematic because it
violates the invariant that `hg status` with every file in the normal set
produces the same output as `hg status` with every file in the lookup set.
With this change, symlink placeholders in the normal set are no longer ignored.
We instead rely on code in localrepo.status that uses heuristics to look for
suspect placeholders.
An upcoming patch will test this out by no longer adding files written in the
last second of an update to the lookup set.
A symlink's target should never be empty in normal use. Solaris and some BSDs
do allow empty symlinks to be created (with varying semantics on dereference),
but a symlink placeholder that started off as empty is either
- going to be empty, in which case ignoring it is fine, since it's unchanged, or
- going to not be empty, in which case this check is irrelevant.
This adds the ability to specify a config option, ui.debugger, to a custom pdb
module, such as ipdb, and have mercurial use that as its debugger. As long as
the value of ui.debugger is a loadable module with the set_trace and
post_mortem functions, then dispatch will be able to use the custom module.
Debugging _parseconfig is still available in the case of an error since it will
be caught with a default the value of pdb.post_mortem.
Previously, command line parsing of --config arguments was done in
_dispatch. This means that it takes place after activating the debugger. In an
upcoming patch, we will add a ui.debugger setting so we need to have this
parsing done before _runcatch.
This will be useful when reusing submerge() to improve the handling of subrepos
on mq.
# HG changeset patch
# User Angel Ezquerra <angel.ezquerra@gmail.com>
# Date 1377117244 -7200
# Wed Aug 21 22:34:04 2013 +0200
# Node ID 2defb5453f223c3027eb2f7788fbddd52bbb3352
# Parent a5c90acff5e61aae714ba6c9457d766c54b4f124
subrepo: make submerge() return the merged substate
This will be useful when reusing submerge() to improve the handling of subrepos
on mq.
This changes the behavior for queries which point at a revision directly,
now the output is consistent to other cases: it results in only this matched
revision shown, not the log starting with it.
A new test checks this behaviour and fails for the old one.
This changes makes clearer which arguments can a function depend on. Now all
the modified functions depend on the 'query' argument only, but future additions
will change it.
This makes possible to use unionrevlog class with subclasses of revlog that
override revlog's 'revision' and 'revdiff' methods. In particular this change
is necessary to implement manifest compression, as it allows extension to
replace manifest class and override 'revision' amd 'revdiff' methods there.
This makes possible to use bundlerevlog class with subclasses of revlog
that override revlog's 'revision' method. In particular this change is necessary
to implement manifest compression, as it allows extension to replace manifest
class and override 'revision' method there.
This change will allow revlog subclasses that override 'checkhash' method
to use custom strategy of computing nodeids without overriding 'addrevision'
method. In particular this change is necessary to implement manifest
compression.
Extract method that decides whether nodeid is correct for paricular revision
text and parent nodes. Having this method extracted will allow revlog
subclasses to implement custom way of computing nodes. In particular this
change is necessary to implement manifest compression.
After a day of hunting this defect, I'm now unable to reproduce the
bug without this patch applied. Regardless, this should fix the
problem I was observing with wireshark. Hopefully this fixes any
flakiness in the buildbot from http2.