Localrepo now supports the unbundle method of pushing changegroups. We
plan to use the unbundle call for bundle2 so it is important that all
peers supports it. The `peer.unbundle` and `peer.addchangegroup` code
path have small difference so cause some test output changes. None of those
changes seems problematic.
The `exchange` module now contains an `unbundle` function that holds the core
unbundle logic. The wire protocol keeps its own unbundle function. It enforces
wireprotocol-specific logic and then calls the extracted function.
This aims at implementing unbundle for localrepo.
We are going to refactor the unbundle function to have it working on
a local repository too. Having this function extracted will ease the process.
In the case of non-matching heads, the function now directly raises an
exception. The top level of the function is catching it.
When a changegroup is added by a push on a publishing server, we ensure they
are added as public. This is used to enforce publishing on server when the
client is not aware of phases. It also prevents race conditions where a reader
could see the changesets as draft before they get turned public by the client.
Finally, this save rounds trip as the client does not need additional request to
turn them public.
However, this logic was only enforced when the changegroup was from a "push"
source. And "push" is used for local pushes only. Wireprotocol push uses "serve"
as source since Mercurial 1.9. We now enforce this logic for both "push" and
"serve" sources.
One could note that this logic was mainly useful during wireprotocol exchanges.
So this code is finally put into good use, 9 versions after its introduction.
Python uses a C long (32 bits on Windows 64) rather than an ssize_t in
read(), and thus has a 2G size limit. Work around this by falling back
to reading one chunk at a time on overflow. This approximately doubles
our headroom until we run back into the size limit on single reads.
If backout generated no changes to commit, it showed wrong status, "changeset
<target> backs out changeset <target>", and raised TypeError with -v option.
This changes the return code to 1, which is the same as "hg commit" and
"hg rebase".
Before this patch, "contrib/check-code.py" can't detect these
problems, because the regexp pattern to detect "% inside _()" doesn't
suppose the case that format string consists of multiple string
components concatenated implicitly or explicitly,
This patch does below for that regexp pattern to detect "% inside _()"
problems in such case.
- put "+" into separator part ("[ \t\n]") for explicit concatenation
("...." + "...." style)
- enclose "component and separator" part by "(?:....)+" for
concatenation itself ("...." "...." or "...." + "....")
Before this patch, "contrib/check-code.py" can't detect these
problems, because the regexp pattern to detect "% inside _()" doesn't
suppose the case that the format string and "%" aren't placed in the
same line.
This patch replaces "\s" in that regexp pattern with "[ \t\n]" to
detect "% inside _()" problems in such case.
"[\s\n]" can't be used in this purpose, because "\s" is automatically
replaced with "[ \t]" by "_preparepats()" and "\s" in "[]" causes
nested "[]" unexpectedly.
Since changeset a8955c4d9ef5, "reposetup()" of each extensions is
invoked only on repositories enabling corresponded extensions.
This causes that largefiles specific interactions between the
repository enabling largefiles locally and remote (wire) peer fail,
because there is no way to know whether largefiles is enabled on the
remote repository behind the wire peer, and largefiles specific
"wireproto functions" are not given to any wire peers.
To avoid this problem, largefiles should be enabled in wider scope
than each repositories (e.g. user-wide "${HOME}/.hgrc").
This patch introduces "wirepeersetupfuncs" to setup wire peer by
extensions already enabled. Functions registered into
"wirepeersetupfuncs" are invoked for all wire peers.
This patch uses plain list instead of "util.hooks" for
"wirepeersetupfuncs", because the former allows to control order of
function invocation by order of extension enabling: it may be useful
for workaround of problems with combination of enabled extensions
Previously, if a template '{foo()}' was given, the buildfunc would not be able
to match it and hit a code path that would not return so it would error out
later in the templater stating that NoneType was not iterable. This patch makes
sure that a proper error is raised so that the user can be informed.
Tests have been updated.
When having ui.debugger=somedebugger in one's ~/.hgrc, this then
somedebugger would be imported for every hg command. With this patch,
this import only happens if the --debugger parameter is passed.
Changset 39a4d61c40d6 rewriting "hg.copystore()" with vfs uses
'dstbase + "/lock"' instead of "os.path.join()", because target files
given from "store.copyfiles()" already uses "/" as path separator
But in the repository using revlog format 0, "dstbase" becomes empty
("data" directory is located under ".hg" directly), and 'dstbase +
"/lock"' is treated as "/lock": in almost all cases, write access to
"/lock" causes "permission denied".
This patch uses "os.path.join()" to join path components which may be
empty in "hg.copystore()".
Before this patch, `hg commit --secret` was not getting propagated
correctly, and subrepos were not getting the commit in the secret
phase. The problem is that subrepos get their ui from the base repo's
baseui object and ignore the ui object passed on to them. This sets
and restores both ui objects with the appropriate option.
Before this patch, commit message (may be manually edited) for "commit
--amend" is never saved into ".hg/last-message.txt", because it uses
"localrepository.commitctx()" instead of "localrepository.commit()":
saving into ".hg/last-message.txt" is executed only in the latter.
This patch saves commit message for "commit --amend" into
".hg/last-message.txt" just after user editing.
This is the simplest implementation to fix on stable. Editing and
saving commit message for memctx should be centralized into the
framework like "localrepository.commit()" with "editor" argument or so
in the future.
Before this patch, manually edited commit message for "hg tag -e"
isn't saved into ".hg/last-message.txt" until it is saved by
"localrepository.savecommitmessage()" in "localrepository.commit()".
This may lose such commit message, if unexpected exception is raised.
This patch saves manually edited commit message for "hg tag -e" into
".hg/last-message.txt" just after user editing. This patch doesn't
save the message specified by -m option (-l is not supported for "hg
tag") as same as other commands.
This is the simplest implementation to fix on stable. Editing and
saving commit message should be centralized into the framework of
"localrepository.commit()" with "editor" argument in the future.
Before this patch, "localrepository.commit()" invokes specified
"editor" to edit commit message manually, and saves it after checking
sub-repositories.
This may lose manually edited commit message, if unexpected exception
is raised while checking (or commiting recursively) sub-repositories.
This patch saves manually edited commit message as soon as possible.
Before this patch, "hg commit --amend --secret" doesn't create new
amend changeset as secret, even though the internal function
"commitfunc()" passed to "cmdutil.amend()" make "phases.new-commit"
configuration as "secret" temporarily.
"cmdutil.amend()" uses specified "commitfunc" only for temporary amend
commit, and creates the final amend commit changeset by
"localrepository.commitctx()" directly with memctx.
This patch creates new amend changeset as secret correctly for
"--secret" option, by changing "phases.new-commit" configuration
temporarily before "localrepository.commitctx()".
Changeset 83ff877959a6 (released with 2.8.1) fixed "recursively
evaluate string literals as templates" problem (issue4102) by moving
the location of "string-escape"-ing from "tokenizer()" to
"compiletemplate()".
But some parts in template expressions below are not processed by
"compiletemplate()", and it may cause unexpected result.
- 'expr' of 'if(expr, then, else)'
- 'expr's of 'ifeq(expr, expr, then, else)'
- 'sep' of 'join(list, sep)'
- 'text' and 'style' of 'rstdoc(text, style)'
- 'text' and 'chars' of 'strip(text, chars)'
- 'pat' and 'repl' of 'sub(pat, repl, expr)'
For example, '\n' of "{join(extras, '\n')}" is not "string-escape"-ed
and treated as a literal '\n'. This breaks "Display the contents of
the 'extra' field, one per line" example in "hg help templates".
Just "string-escape"-ing on each parts above may not work correctly,
because inside expression of nested ones already applies
"string-escape" on string literals. For example:
- "{join(files, '\n')}" doesn't return "string-escape"-ed string, but
- "{join(files, if(branch, '\n', '\n'))}" does
To fix this problem, this patch does:
- introduce "rawstring" token and "runrawstring" method to handle
strings not to be "string-escape"-ed correctly, and
- make "runstring" method return "string-escape"-ed string, and
delay "string-escape"-ing until evaluation
This patch invokes "compiletemplate()" with "strtoken=exp[0]" in
"gettemplate()", because "exp[1]" is not yet evaluated. This code path
is tested via mapping ("expr % '{template}'").
In the other hand, this patch invokes it with "strtoken='rawstring'"
in "_evalifliteral()", because "t" is the result of "arg" evaluation
and it should be "string-escape"-ed if "arg" is "string" expression.
This patch doesn't test "string-escape"-ing on 'expr' of 'if(expr,
then, else)', because it doesn't affect the result.
Templating syntax allows nested expression to be specified as parts
below, but they are evaluated as a generator and don't work correctly.
- 'sep' of 'join(list, sep)'
- 'text' and 'chars' of 'strip(text, chars)'
In the former case, 'sep' returns expected string only for the first
separation, and empty one for the second or later, because the
generator has only one element.
In the latter case, templating is aborted by exception, because the
generator doesn't have 'strip()' method (as 'text') and can't be
passed as the argument to 'str.strip()' (as 'chars').
This patch applies "stringify()" on these sub expression to get string
correctly.
Changeset c84f81c3e120 (released with 2.8.1) fixed "recursively
evaluate string literals as templates" problem (issue4103) by
introducing "_evalifliteral()".
But some parts in template expressions below are still processed by
the combination of "compiletemplate()" and "runtemplate()", and may
cause same problem unexpectedly.
- 'init' and 'hang' of 'fill(text, width, init, hang)'
- 'expr' of 'sub(pat, repl, expr)'
- 'label' of 'label(label, expr)'
This patch processes them by "_evalifliteral()" instead of the
combination of "compiletemplate()" and "runtemplate()" to avoid
recursive evaluation of string literals completely.
We can use the "other" data from the recorded merge state instead of inferring
what the other could be from working copy parent. This will allow resolve to
fulfil its duty even when the second parent have been dropped.
Most direct benefit is fixing a regression in backout.
This data is mostly redundant with the "other" changeset node (+ other changeset
file path). However, more data never hurt.
The old format do not store it so this require some dancing to add and remove it
on demand.
When we have to fallback to the old version of the file, we infer the
"other" from current working directory parent. The same way it is currently done
in the resolve command. This is know to have shortcoming… but we cannot do
better from the data contained in the old file format. This is actually the
motivation to add this new file format.
We need to record the merge we were merging with. This solve multiple
bug with resolve when dropping the second parent after a merge. This
happen a lot when doing special merge (overriding the ancestor).
Backout, shelve, rebase, etc. can takes advantage of it.
This changeset just add the information in the merge state. We'll use it in the
resolve process in a later changeset.
This new format will allow us to address common bugs while doing special merge
(graft, backout, rebase…) and record user choice during conflict resolution.
The format is open so we can add more record for future usage.
This file still store hexified version of node to help human willing to debug
it by hand. The overhead or oversize are not expected be an issue.
The old format is still used. It will be written to disk along side the newer
format. And at parse time we detect if the data from old version of the
mergestate are different from the one in the new version file. If its the same,
both have most likely be written at the same time and you can trust the extra
data from the new file. If it differs, the old file have been written by an
older version of mercurial that did not knew about the new file. In that case we
use the content of the old file.
The format of the file is unchanged. But we are preparing a new file with a new
format that would be record based. So we change all the read/write logic to
handle a list of record until a very low level. This will allow simple plugging
of the new format in the current code.
When pulling changes from a compressed bundle Mercurial first uncompresses it
to a temporary file in .hg directory. This file will not be deleted unless
the bundlerepo (other) is explicitly closed.
This is similar to cleanup that occurs after incoming.
Until now, repositories did not provide any value for isdirectory in rows
produced for the index output, and thus isdirectory was generally evaluated as
None for each index entry representing a repository.
However, directories (visible when viewed with the descend and collapse
settings enabled) did provide a value of True and this value appeared to
persist in subsequent rows processed by the templater, causing isdirectory
tests in templates to produce incorrect results for index entries appearing
after directories.
This patch asserts the None value for repositories, thus erasing any such
persistent True values.
Compiling mercurial with the Sun Studio compiler gives seven copies of the
following warning on pathencode.c:
line 533: warning: initializer will be sign-extended: -1
Using explicit unsigned literals silences it.
Since afe2bc876c89, repo.cancopy() cannot be used to check if the repo is
a bundlerepository.
repo.url() should always have "scheme:", so it isn't necessary to parse
by util.url().
The certificate was updated in February 2014.
You can verify the certificate by using the Root CA certificate downloadable
from https://ssl.intevation.de/
The intermediate CA is sent by https://hg.intevation.org/
This fixes an issue introduced in 818c8992811a where, when disabling
demandimport while running hooks, it's inadvertently re-enabled even when
it was never enabled in the first place.
This doesn't affect normal command line usage of Mercurial; it only matters
when Mercurial is run with demandimport intentionally disabled.
Merge could overwrite untracked files and cause data loss.
Instead we now handle the 'local side removed file and has untracked file
instead' case as the 'other side added file that local has untracked' case:
FILE: untracked file exists
abort: untracked files in working directory differ from files in requested revision
It could perhaps make sense to create .orig files when overwriting, either
instead of aborting or when overwriting anyway because of force ... but for now
we stay consistent with similar cases.
A correct patch for this has existed in Python's BTS for 3 years
(http://bugs.python.org/issue9291), so waiting for it to be fixed
upstream is probably not a viable strategy. Instead, we add this
horrible hack to workaround the issue in existing copies of Python
2.4-2.7.
Before this changeset local clone of a repo with hidden changeset would include
then in the clone (why not) and turn them public (plain wrong). This happened
because the copy clone publish by dropping the phaseroot file entirely making
everything in the repo public (and therefore immune to obsolescence marker).
This changeset takes the simplest fix, we deny the copy clone in the case of hidden
changeset falling back to pull clone that will exclude them from the clone and
therefore not turning them public.
A smarter version of copy clone could be done, but I prefer to go for the
simplest solution first.
This fixes mistake of documentation about matching against directories
in "pattern.txt" introduced by b99923dc748f.
".hgignore" treats specified "glob:" pattern as same as one specified
for "-X" option: it can match against directories, too.
For reference, extra regexp string appended to specified pattern for
each types are listed below: see also "match.match()" and
"match._regex()" for detail.
============= ========== ===============
type cmdline -I/-X
============= ========== ===============
glob/relglob '$' '(?:/|$)'
path/relpath '(?:/|$)' '(?:/|$)'
re/relre (none) (none)
============= ========== ===============
Appending '$' means that the specified pattern should match against
only files.
Before this patch, shell alias may be executed by abbreviated command
name unexpectedly, even if abbreviated command name matches also
against the command provided by extension.
For example, "rebate" shell alias is executed by "hg reba", even if
rebase extension (= "rebase" command) is enabled. In this case, "hg
reba" should be aborted because of command name ambiguity.
This patch makes "_checkshellalias()" invoke "cmdutil.findcmd()"
always with "strict=True" (default value).
If abbreviated command name matches against only one shell alias even
after loading extensions, such shell alias will be executed via
"_parse()".
This patch doesn't remove "_checkshellalias()" invocation itself,
because it may prevent shell alias from loading extensions uselessly.
caps.remove('bundle2') was throwing an exception if bundle2 wasn't present in
the capabilities. This was causing test-static-http.t to hang. Let's just use
discard, so we don't get an exception.
The static http repository was doing his own filtering of capability ignoring
the filtering done in the local repo main class. This led to static http using
the current draft of bundle2. We now apply both.
The bundlecaps passed to exchange.getbundle were being dropped completely. We
should pass them on through to the changegroup.
This affected the remotefilelog extension, since it relies on those bundlecaps.
This changeset refactors the pull code to use a bundle2 when available. We keep
bundle2 disabled by default. The current code is not ready for prime time.
Ultimately we'll want to unify the API of `bunde10` and `bundle20` to have less
different code. But for now, testing the bundle2 exchange flow is an higher
priority.
This function can return a `HG10` or `HG20` bundle. It uses the `bundlecaps`
parameters to decides which one to return.
This is a distinct function from `changegroup.getbundle` for two reasons. First
the APIs of `bundle10` and `bundle20` are not compatible yet. The two functions
may be reunited in the future. Second `exchange.getbundle` will grow parameters
for all kinds of data (phases, obsmarkers, ...) so it's better to keep the
changegroup generation in its own function for now.
This function will be used it in the next changesets.
We use the `gettransaction` method approach already used for pull. We
need this because we do not know beforehand if the bundle needs a
transaction to be created. And (1) we do not want to create a
transaction for nothing. (2) Some bundle2 bundles may be read-only and
do not require any lock or transaction to be held.
The current changegroup format is put in a "changegroup" part and processed by
an appropriate handlers.
This is not production ready code, but let us start smoke testing.
Part handlers can now add records to the `bundleoperation` object. This can be
used to help other parts or to let the caller of the unbundling process react
to the results.
This object will hold all data and state gathered through the processing of a
bundle. This will allow:
- each handler to be aware of the things unbundled so far
- the caller to retrieve data about the execution
- bear useful object and logic (like repo, transaction)
- bear possible bundle2 reply triggered by the unbundling.
For now the object is very simple but it will grow at the same time as the
bundle2 implementation.
The unbundle can comes from multiple sources. (on disk file, peer, etc) and
(ultimately) of multiple type (bundle10, bundle20). The `processbundle` is no
longer in charge of creating the bundle.
No functionality has changed, since we've only extracted the code into its own
function. Now extensions can wrap _gethiddenblockers to provide their own
blocker without polluting bookmarks or local tags.
For repos with a large number of heads (including hidden heads), a stale tag
cache would cause computehidden to be drastically slower because of a the call
to repo.tags() (which would build the tag cache).
We actually don't need the tag cache for computehidden because we filter out
global tags. This patch replaces the call to repo.tags with readlocaltags so
as to avoid the tag cache.
This is a gratuitous code move aimed at reducing the localrepo bloatness.
The method had few callers, not enough to be kept in local repo.
The peer API stay unchanged.
This is a gratuitous code move aimed at reducing the localrepo bloatness.
The method had few callers, not enough to be kept in local repo.
The peer API remains unchanged.
This is a gratuitous code move aimed at reducing the localrepo bloatness.
The method had few callers, not enough to be kept in local repo.
The peer API remains unchanged.
The `pushoperation` object contains strictly more data the arguments currently
passed to `localrepo.checkpush` we pass the new object instead. This function is
used by MQ to abort push that includes MQ changesets.
Note: extension that may use this function will have to align.
Move move in the same direction we took for command line commands. each wire
protocol function will be decorated with its name and arguments.
Beside beside easier to read, this open the road to easily adding more metadata
(like security level or return type)
We already have multiple call function for multiple return type. The
`_decompress` function is only used for http and seems like a layer violation.
We drop it in favor of a new call type dedicated to "stream that may be useful to
compress".
The wireprotocole command use a small set of class as return value. We document
the meaning of each of them.
AS my knowledge of wire protocol is fairly shallow, the documentation can
probably use improvement. But this is a better than nothing.
With bundle2 we'll slowly move current steps into a single bundle2 building and
call (changegroup, phases, obsmarkers, bookmarks). But we need to keep the
old methods around for servers that do not support bundle2. So we introduce
this set of steps that remains to be done.
With bundle2 we'll have multiple code path susceptible to be responsible for
adding changeset during the pull. We move it to the pull operation to simplify
this the coming logic. The one doing the adding the changegroup will set the
value.
test-merge-types.t changes a bit in flag merging. It relied on the
implementation detail that 100% identical revlog entries are reused. The revlog
reuse did that fctx.ancestor() saw an ancestor where there really not was one.
Previously we would iterate over every item in the subset, checking if it was in
the provided args. This often meant iterating over every rev in the repo.
Now we iterate over the args provided, checking if they exist in the subset.
On a large repo this brings setting phase boundaries (which use this revset
roots(X:: - X::Y)) down from 0.8 seconds to 0.4 seconds.
The "roots((tip~100::) - (tip~100::tip))" revset in revsetbenchmarks shows it
going from 0.12s to 0.10s, so we should be able to catch regressions here in the
future.
This actually introduces a regression in 'roots(all())' (0.2s to 0.26s) since
we're now using spansets, which are slightly slower to do containment checks on.
I believe this trade off is worth it, since it makes the revset O(number of
args) instead of O(size of repo).
Previously revset._descendants would iterate over the entire subset (which is
often the entire repo) and test if each rev was in the descendants list. This is
really slow on large repos (3+ seconds).
Now we iterate over the descendants and test if they're in the subset.
This affects advancing and retracting the phase boundary (3.5 seconds down to
0.8 seconds, which is even faster than it was in 2.9). Also affects commands
that move the phase boundary (commit and rebase, presumably).
The new revsetbenchmark indicates an improvement from 0.2 to 0.12 seconds. So
future revset changes should be able to notice regressions.
I removed a bad test. It was recently added and tested '1:: and reverse(all())',
which has an amibiguous output direction. Previously it printed in reverse order,
because we iterated over the subset (the reverse part). Now it prints in normal
order because we iterate over the 1:: . Since the revset itself doesn't imply an
order, I removed the test.
When the bundle processing abort on unknown mandatory parts, we now makes sure
all the bundle content is read. This avoid leaving the communication channel in
an unrecoverable state.
Mandatory part cannot be ignored when unknown. We raise a simple KeyError
exception when this happen.
This is very early version of this logic, see inline comment for future
improvement lead.
We now have a function that interpret part content.
This is a version early version of this function. It'll see major changes in
scope and API in future development. As for previous I'm just focussing on
getting minimal logic setup. Refining will happen with real world usage.
We use the same approach that the other unpack, as function is given the struct
format and his both responsible for reading the right amount of data from the
header and unpack the struct.
This give use flexibility if we decide to change the size of something in the
format before the release.
Previously the fncache was cleaned up at read time by noticing when it was out
of sync. This caused writes to happen outside the scope of transactions and
could have caused race conditions. With this change, we'll keep the fncache
up-to-date as we go by removing old entries during repair.strip.
The fncache was not being properly invalidated each time the lock was taken, so
in theory it could contain old data from prior to the caller having the lock.
This changes it to be invalidated as soon as the lock is taken (same as all our
other caches).
Previously the fncache was written at lock.release time. This meant it was not
tracked by a transaction, and if an error occurred during the fncache write it
would fail to update the fncache, but would not rollback the transaction,
resulting in an fncache that was not in sync with the files on disk (which
causes verify to fail, and causes streaming clones to not copy all the revlogs).
This uses the new transaction backup mechanism to make the fncache transacted.
It also moves the fncache from being written at lock.release time, to being
written at transaction.close time.
This adds support for normal, non-append-only files in transactions. For
example, .hg/store/fncache and .hg/store/phaseroots should be written as part of
the transaction, but are not append only files.
This adds a journal.backupfiles along side the normal journal. This tracks which
files have been backed up as part of the transaction. transaction.addbackup()
creates a backup of the file (using a hardlink), which is later used to recover
in the event of the transaction failing.
Using a seperate journal allows the repository to still be used by older
versions of Mercurial. A future patch will use this functionality and add tests
for it.
Adds an optional onclose parameter to transactions that gets called just before
the transaction is committed. This allows things that build up data over the
course of the transaction (like the fncache) to commit their data.
Also adds onabort. It's not used, but will allow extensions to hook into onclose
and onabort to provide transaction support.
Streaming clones were writing to files outside of a transaction. Soon the
fncache will be written at transaction close time, so we need streaming clones
to be in a transaction.
The fncache could rewrite itself during a read operation if it noticed any
entries that were no longer on disk. This was problematic because it caused
Mercurial to perform write operations outside the scope of a lock or
transaction, which could interefere with any other pending writes.
This will be replaced in a future patch by logic that cleans up the fncache
as files are deleted during strips.
In the bare pull case we could add the same node multiple time to the
`pulloperation.pulledsubset`. Beside being a bit wrong this confused the new
revset implementation of `revset._revancestor` into giving bad result.
This changeset fix the pull operation part. The fix for the revset itself will
come in another changeset.
We add the ability to bundle and unbundle a payload in parts. The payload is the
actual binary data of the part. It is used to convey all the applicative data.
For now we stick to very simple implementation with all the data fit in single
chunk. This open the door to some bundle2 testing usage. This will be improved before
bundle2 get used for real. We need to be able to stream the payload in multiple
part to exchange any changegroup efficiently. This simple version will do for
now.
Bundling and unbundling are done in the same changeset because the test for
parts is less modular. However, the result is not too complex.
min([]) raise a ValueError, we do the same thing in smartset.min() and
smartset.max() for the sake of consistency.
The min/amax test are greatly improved in the process to prevent this familly
of regression
Since recent revset changes, revrange now return a smartset. This smart set
probably does not support indexing (_addset does not). This led to crash.
Instead when the smartset is ordered we use the `min` and `max` method of
smart set. Otherwise we turn is into a list and use indexing on it.
The tests have been updated to catch such regression.
This should correct an earlier couple of bad merges (5433856b2558 and
596960a4ad0d, now pruned) that accidentally brought in a change that had
been marked obsolete (244ac996a821).
Back in the time where repo.revs(...) returned a list, calling `len(...)` on the
result was quite common. We reinstall this on _addset.
There is absolutely no easy way to test this from the command line. The commands
using this in the evolve extension will eventually land into core.
Before this patch, internal function "display()" of "hg grep" is not
efficient for "-l"/"--files-with-matches", because loop is continued,
even after the first matching is found in the specified file.
This patch exits loop immediately, if matching is found for
"--files-with-matches".
In this case, "before is None" is equal to "opts.get('files_with_matches')".
Before this patch, internal function "display()" of "hg grep" stores
whether matching is already found or not into the dictionary
"filerevmatches" by "(fn, rev)" tuple as the key.
But this is redundant, because:
- "filerevmatches" is local variable of "display()", so each
"display()" invocations don't affect others
- both "fn" and "rev" (gotten from "ctx" argument) are never changed
in each "display()" invocations
Then, "filerevmatches" should have only one entry at most, and "(fn,
rev) in filerevmatches" should be equal to "found".
This patch uses "found" instead of "filerevmatches" examination for
efficiency.
Before this patch, to check whether the file in the specified revision
is binary or not, "util.binary()" is invoked via internal function
"binary()" of "hg grep" once per a line of "hg grep" output, even
though binary-ness is not changed in the same file.
This patch reuses the first "util.binary()" invocation result by
annotating internal function "binary()" with "@util.cachefunc".
Performance improvement measured by "hgperf grep -r 5e9e7af9ae01 vfs
mercurial/scmutil.py":
before this patch:
! wall 0.024000 comb 0.015600 user 0.015600 sys 0.000000 (best of 118)
after this patch:
! wall 0.023000 comb 0.015600 user 0.015600 sys 0.000000 (best of 123)
Status of recent(5e9e7af9ae01) "mercurial/scmutil.py":
# of lines: 919 (may affect cost of search)
# of bytes: 29633 (may affect cost of "util.binary()")
# of matches: 22 (may affect frequency of "util.binary()")
Before this patch, "util.cachefunc()" caches the value returned by the
specified function into dictionary "cache", even if the specified
function takes no arguments.
In such case, "cache" has at most one entry, and distinction between
entries in "cache" is meaningless.
This patch adds the code path to "cachefunc()" for the function taking
no arguments for efficiency: to store only one cached value, using
list "cache" is a little faster than using dictionary "cache".