A subsequent patch will refactor _chunks() and the calculation of the
offset will no longer occur in that function. Prepare by returning the
offset from _chunkraw().
We have a convention of using -c|-m|FILE elsewhere for reading from
revlogs. Use it for `hg perfrevlog`.
While I was here, I also added a docstring to document what this
command does, as "perfrevlog" is ambiguous.
As part of investigating performance improvements to revlog reading,
I needed a mechanism to measure every part of revlog reading so I knew
where time was spent and how effective optimizations were.
This patch implements a perf command for benchmarking the various
stages of reading a single revlog revision.
When executed against a manifest revision at the end of a 30,000+
long delta chain in mozilla-central, the command demonstrates that
~80% of time is spent in zlib decompression.
The old code only partially cleared the caches. Now that we have a
comprehensive method for wiping all caches, let's call it.
This appears to introduce a marginal regression in `hg perfmanifest`
on mozilla-central. This is good because the new result is more
accurate since caches aren't being used.
The /dev/null redirect was causing the following error:
The system cannot find the path specified.
Adjusting HGRCPATH as part of the command line causes the system to try to
execute 'HGRCPATH'.
Once we get a matcher down into manifestmerge, we can make narrowhg
work more easily and potentially let manifest.match().diff() do less
work in manifestmerge.
Previously, perfrevset called repo.revs(), which only returns integer
revisions. Many revset consumers call repo.set(), which returns
changectx instances. Or they obtain a context manually later.
Since obtaining changectx instances when evaluating revsets is common,
this patch adds support for benchmarking this use case.
While we added an if conditional for every benchmark loop, it
doesn't appear to matter since revset evaluation dwarfs the cost
of a single if.
Now, 'dirstate.write(tr)' delays writing in-memory changes out, if a
transaction is running.
This may cause treating this revision as "the first bad one" at
bisecting in some cases using external hook process inside transaction
scope, because some external hooks and editor process are still
invoked without HG_PENDING and pending changes aren't visible to them.
'dirstate.write()' callers below in localrepo.py explicitly use 'None'
as 'tr', because they can assume that no transaction is running:
- just before starting transaction
- at closing transaction, or
- at unlocking wlock
This change touches every module in which repository.sopener was being used, and
changes it for the equivalent repository.svfs.
It should now be possible to remove localrepo.sopener.
The recursive addremove operation occurs completely before the first subrepo is
committed. Only hg subrepos support the addremove operation at the moment- svn
and git subrepos will warn and abort the commit.
We use a `formatter` object in the perf extensions. This allow the use of
formatted output like json. To avoid adding logic to create a formatter and pass
it around to the timer function in every command, we add a `gettimer` function
in charge of returning a `timer` function as simple as before but embedding an
appropriate formatter.
This new `gettimer` function also return the formatter as it needs to be
explicitly closed at the end of the command.
example output:
$ hg --config ui.formatjson=True perfvolatilesets visible obsolete
[
{
"comb": 0.02,
"count": 126,
"sys": 0.0,
"title": "obsolete",
"user": 0.02,
"wall": 0.0199398994446
},
{
"comb": 0.02,
"count": 117,
"sys": 0.0,
"title": "visible",
"user": 0.02,
"wall": 0.0250301361084
}
]
With the new lazy revset implementation, we need to actually read all elements
to trigger all the computations. Otherwise a no-op if of course much faster than
the full work.
This is a step towards breaking an import cycle between revset and
repoview. Import cycles happened to work in Python 2 with implicit
relative imports, but breaks on Python 3 when we start using explicit
relative imports via 2to3 rewrite rules.
The annotate algorithm used a custom parents() function to try to reuse
filectx and filelogs. I simplified it a bit to rely more heavily on the
self.parents() which makes it work well with alternative filectx
implementations. I tested performance on a file with 5000+ revisions
but no renames, and on a file with 500 revisions repeating a series of
4 edits+renames and saw zero performance hit. In fact, it was reliably a
couple milliseconds faster now.
Added the perfannotate command to contrib/perf.py for future use.