And port "hg files" to test it.
As you can see, extra indent is necessary to port to this API. I don't think
we should switch every fm.formatter() call to "with" statement.
All actions but one actually have the same constraints when it comes to validate
the 'action.node' value. So we actually just add this code to a method that can
be overwritten in the one action where it matters.
The now unused 'contraints' related enum and class attribute will be cleaned up
in the next changeset.
Action has a method dedicated to verifying its validity. So we move code
related to constrains into that method. This requires a bit more context to the
'verify' method in the same fashion we were passing the 'prev' argument.
This is an extra step before we can simplify the constraint handling code
further.
It does not seem useful to convert to hex: it is an extra step and they are
longer strings. So we stick to node for the logic. We only convert to short hex
for error when needed. As a nice side effect this remove the explicit constant
usage in'[12:]'. This will also help moving the code around later as we just
have to access action.node.
An upcoming changeset will make the line where this variable is used
slightly too long. Other later changesets will clean that up further and makes
the variable unnecessary, so this is only temporary and it does seems useful to
put anything more complicate in place.
There does not seem to be a reason for this to be a method. So we initialise
the class attribute once and for all at creation time and drop the instance
method.
That method is just returning self.node and is never overridden. We just use
the attribute directly instead and get rid of the method.
This is the beginning of series to simplify and unify verification of constraints
for actions.
The 'journal' naming is already used by the transaction journal. Having an
unrelated group of file with such a close naming is confusing and error prone.
We rename the file used by the 'journal' extension to use 'namejournal' as the
extension track the location of various 'names'.
demandimport and setuptools and decorator (from ironpython) and
pygments leads to lots of fail.
If demandimport is disabled we should skip testing it...
My bzr does not have bzrlib.revisionspec.RevisionSpec,
and thus tests were failing because convert refused to believe in bzr,
but hghave without this change thought it was available.
There isn't a formal handshake protocol in the wire protocol. But
clients almost certainly need to perform particular actions before they
can communicate with a server optimally. So document what that is
so people understand what's going on at connection establishment time.
The Mercurial wire protocol is under-documented. This includes a lack
of source docstrings and comments as well as pages on the official
wiki.
This patch adds the beginnings of "internals" documentation on the
wire protocol.
The documentation should have nearly complete coverage on the
lower-level parts of the protocol, such as the different transport
mechanims, how commands and arguments are sent, capabilities, and,
of course, the commands themselves.
As part of writing this documentation, I discovered a number of
deficiencies in the protocol and bugs in the implementation. I've
started sending patches for some of the issues. I hope to send a lot
more.
This patch starts with the scaffolding for a new internals page.
Several fields are renamed to be consistent with the annotate command, which
doesn't mean the last call for the name unification [1]. Actually, I'd rather
rename line_number to linenumber, linenum, lineno or line, but I want to
port the grep command to formatter first.
[1]: https://www.mercurial-scm.org/wiki/GenericTemplatingPlan#Dictionary
I don't have any better name for the list of matched/unmatched texts, so
they are just called as "texts".
These columns should always be available in JSON or template outputs. The
"change" column is excluded because it has no useful data unless --all is
specified.
As preparation for formatter support, this and the next patch split
linestate.__iter__() into two functions, line scanner and displayer.
New code uses regexp.search(str, pos) in place of regexp.search(substr),
which appears to fix a bug of highlighting.
This does a fall-back check for style files or directories that are
in Mercurial's template path for user convenience.
We intentionally don't use this for the built-in coal style because we don't
want the style to mysteriously break if the working directory just
happens to have a file named "paper".
This should be extremely useful for helping users debug without having
to see their complete configuration.
Shell aliases do not get their expansion logged, because we don't look
and see if we're in a repo before we dive into the execution of a
shell alias. As a result, the ui object doesn't know where to log.
I've caught multiple extensions in the wild lying about being
'internal', so it's time to move the goalposts on people. Goalpost
moving will continue until third party extensions stop trying to
defeat the system.
We sometimes need to build nested items by formatter, but there was no
convenient way other than building and putting them manually by fm.data():
exts = []
for n, v in extensions:
fm.plain('%s %s\n' % (n, v))
exts.append({'name': n, 'ver': v})
fm.data(extensions=exts)
This should work for simple cases, but doing this would make it harder to
change the underlying data type for better templating support.
So this patch provides fm.nested(field), which returns new nested formatter
(or self if items aren't structured and just written to ui.) A nested formatter
stores items which will later be rendered by the parent formatter.
fn = fm.nested('extensions')
for n, v in extensions:
fn.startitem()
fn.write('name ver', '%s %s\n', n, v)
fn.end()
Nested items are directly exported to a template for now:
{extensions % "{name} {ver}\n"}
There's no {extensions} nor {join(extensions, sep)} yet. I have a plan for
them by extending fm.nested() API, but I want to revisit it after trying
out this API in the real world.
New converter classes will be reused by a nested formatter. See the next
patch for details.
This change is also good in that the default values are defined uniquely
by the baseformatter.
Previously, the SIGWINCH handler does not get cleared and if the commit
message editor also needs SIGWINCH handling (like vim), the two SIGWINCH
handlers (the editor's, ours) will have a race. And we may erase the
editor's screen content.
This patch restores SIGWINCH handler to address the above issue.
See the comment for a detailed explanation why.
Even though this is a bug, I've sent it to 'default' rather than 'stable'
because it isn't triggered in any code paths in stock Mercurial, just with the
merge driver included. For the same reason I haven't included any tests here --
the merge driver is getting a new test.
Profiling using statprof revealed a hotspot during changegroup
application calculating delta chain bases on generaldelta repos.
Essentially, revlog._addrevision() was performing a lot of redundant
work tracing the delta chain as part of determining when the chain
distance was acceptable. This was most pronounced when adding
revisions to manifests, which can have delta chains thousands of
revisions long.
There was a delta chain base cache on revlogs before, but it only
captured a single revision. This was acceptable before generaldelta,
when _addrevision would build deltas from the previous revision and
thus we'd pretty much guarantee a cache hit when resolving the delta
chain base on a subsequent _addrevision call. However, it isn't
suitable for generaldelta because parent revisions aren't necessarily
the last processed revision.
This patch converts the delta chain base cache to an LRU dict cache.
The cache can hold multiple entries, so generaldelta repos have a
higher chance of getting a cache hit.
The impact of this change when processing changegroup additions is
significant. On a generaldelta conversion of the "mozilla-unified"
repo (which contains heads of the main Firefox repositories in
chronological order - this means there are lots of transitions between
heads in revlog order), this change has the following impact when
performing an `hg unbundle` of an uncompressed bundle of the repo:
before: 5:42 CPU time
after: 4:34 CPU time
Most of this time is saved when applying the changelog and manifest
revlogs:
before: 2:30 CPU time
after: 1:17 CPU time
That nearly a 50% reduction in CPU time applying changesets and
manifests!
Applying a gzipped bundle of the same repo (effectively simulating a
`hg clone` over HTTP) showed a similar speedup:
before: 5:53 CPU time
after: 4:46 CPU time
Wall time improvements were basically the same as CPU time.
I didn't measure explicitly, but it feels like most of the time
is saved when processing manifests. This makes sense, as large
manifests tend to have very long delta chains and thus benefit the
most from this cache.
So, this change effectively makes changegroup application (which is
used by `hg unbundle`, `hg clone`, `hg pull`, `hg unshelve`, and
various other commands) significantly faster when delta chains are
long (which can happen on repos with large numbers of files and thus
large manifests).
In theory, this change can result in more memory utilization. However,
we're caching a dict of ints. At most we have 200 ints + Python object
overhead per revlog. And, the cache is really only populated when
performing read-heavy operations, such as adding changegroups or
scanning an individual revlog. For memory bloat to be an issue, we'd
need to scan/read several revisions from several revlogs all while
having active references to several revlogs. I don't think there are
many operations that do this, so I don't think memory bloat from the
cache will be an issue.
This is the first place where we'll start using manifestctx instances instead of
manifestdict. This will facilitate using different manifestctx implementations
in the future.
The file caches we're using to avoid reloading the manifest from disk everytime
has an annoying bug that causes the in memory structure to not be reloaded if
the mtime and the size haven't changed. This causes a breakage in the tests
because the manifestlog is not being reloaded after a commit+strip operation in
mq (the mtime is the same because it all happens in the same second, and the
resulting size is the same because we add 1 and remove 1). The only reason this
doesn't affect the manifest itself is because we touch it so often that we
had already reloaded it after the commit, but before the strip.
Once the entire manifest has migrated to manifestlog, we can get rid of these
properties, since then the manifestlog will be touched after the commit, but
before the strip, as well.