Improve the lock waiting warning message by explicitly saying that a host and
process are holding the lock. This nudges confused new users in the direction
of investigating the other process instead of removing the lock.
And port "hg files" to test it.
As you can see, extra indent is necessary to port to this API. I don't think
we should switch every fm.formatter() call to "with" statement.
There isn't a formal handshake protocol in the wire protocol. But
clients almost certainly need to perform particular actions before they
can communicate with a server optimally. So document what that is
so people understand what's going on at connection establishment time.
The Mercurial wire protocol is under-documented. This includes a lack
of source docstrings and comments as well as pages on the official
wiki.
This patch adds the beginnings of "internals" documentation on the
wire protocol.
The documentation should have nearly complete coverage on the
lower-level parts of the protocol, such as the different transport
mechanims, how commands and arguments are sent, capabilities, and,
of course, the commands themselves.
As part of writing this documentation, I discovered a number of
deficiencies in the protocol and bugs in the implementation. I've
started sending patches for some of the issues. I hope to send a lot
more.
This patch starts with the scaffolding for a new internals page.
Several fields are renamed to be consistent with the annotate command, which
doesn't mean the last call for the name unification [1]. Actually, I'd rather
rename line_number to linenumber, linenum, lineno or line, but I want to
port the grep command to formatter first.
[1]: https://www.mercurial-scm.org/wiki/GenericTemplatingPlan#Dictionary
I don't have any better name for the list of matched/unmatched texts, so
they are just called as "texts".
These columns should always be available in JSON or template outputs. The
"change" column is excluded because it has no useful data unless --all is
specified.
As preparation for formatter support, this and the next patch split
linestate.__iter__() into two functions, line scanner and displayer.
New code uses regexp.search(str, pos) in place of regexp.search(substr),
which appears to fix a bug of highlighting.
This does a fall-back check for style files or directories that are
in Mercurial's template path for user convenience.
We intentionally don't use this for the built-in coal style because we don't
want the style to mysteriously break if the working directory just
happens to have a file named "paper".
This should be extremely useful for helping users debug without having
to see their complete configuration.
Shell aliases do not get their expansion logged, because we don't look
and see if we're in a repo before we dive into the execution of a
shell alias. As a result, the ui object doesn't know where to log.
I've caught multiple extensions in the wild lying about being
'internal', so it's time to move the goalposts on people. Goalpost
moving will continue until third party extensions stop trying to
defeat the system.
We sometimes need to build nested items by formatter, but there was no
convenient way other than building and putting them manually by fm.data():
exts = []
for n, v in extensions:
fm.plain('%s %s\n' % (n, v))
exts.append({'name': n, 'ver': v})
fm.data(extensions=exts)
This should work for simple cases, but doing this would make it harder to
change the underlying data type for better templating support.
So this patch provides fm.nested(field), which returns new nested formatter
(or self if items aren't structured and just written to ui.) A nested formatter
stores items which will later be rendered by the parent formatter.
fn = fm.nested('extensions')
for n, v in extensions:
fn.startitem()
fn.write('name ver', '%s %s\n', n, v)
fn.end()
Nested items are directly exported to a template for now:
{extensions % "{name} {ver}\n"}
There's no {extensions} nor {join(extensions, sep)} yet. I have a plan for
them by extending fm.nested() API, but I want to revisit it after trying
out this API in the real world.
New converter classes will be reused by a nested formatter. See the next
patch for details.
This change is also good in that the default values are defined uniquely
by the baseformatter.
Previously, the SIGWINCH handler does not get cleared and if the commit
message editor also needs SIGWINCH handling (like vim), the two SIGWINCH
handlers (the editor's, ours) will have a race. And we may erase the
editor's screen content.
This patch restores SIGWINCH handler to address the above issue.
See the comment for a detailed explanation why.
Even though this is a bug, I've sent it to 'default' rather than 'stable'
because it isn't triggered in any code paths in stock Mercurial, just with the
merge driver included. For the same reason I haven't included any tests here --
the merge driver is getting a new test.
Profiling using statprof revealed a hotspot during changegroup
application calculating delta chain bases on generaldelta repos.
Essentially, revlog._addrevision() was performing a lot of redundant
work tracing the delta chain as part of determining when the chain
distance was acceptable. This was most pronounced when adding
revisions to manifests, which can have delta chains thousands of
revisions long.
There was a delta chain base cache on revlogs before, but it only
captured a single revision. This was acceptable before generaldelta,
when _addrevision would build deltas from the previous revision and
thus we'd pretty much guarantee a cache hit when resolving the delta
chain base on a subsequent _addrevision call. However, it isn't
suitable for generaldelta because parent revisions aren't necessarily
the last processed revision.
This patch converts the delta chain base cache to an LRU dict cache.
The cache can hold multiple entries, so generaldelta repos have a
higher chance of getting a cache hit.
The impact of this change when processing changegroup additions is
significant. On a generaldelta conversion of the "mozilla-unified"
repo (which contains heads of the main Firefox repositories in
chronological order - this means there are lots of transitions between
heads in revlog order), this change has the following impact when
performing an `hg unbundle` of an uncompressed bundle of the repo:
before: 5:42 CPU time
after: 4:34 CPU time
Most of this time is saved when applying the changelog and manifest
revlogs:
before: 2:30 CPU time
after: 1:17 CPU time
That nearly a 50% reduction in CPU time applying changesets and
manifests!
Applying a gzipped bundle of the same repo (effectively simulating a
`hg clone` over HTTP) showed a similar speedup:
before: 5:53 CPU time
after: 4:46 CPU time
Wall time improvements were basically the same as CPU time.
I didn't measure explicitly, but it feels like most of the time
is saved when processing manifests. This makes sense, as large
manifests tend to have very long delta chains and thus benefit the
most from this cache.
So, this change effectively makes changegroup application (which is
used by `hg unbundle`, `hg clone`, `hg pull`, `hg unshelve`, and
various other commands) significantly faster when delta chains are
long (which can happen on repos with large numbers of files and thus
large manifests).
In theory, this change can result in more memory utilization. However,
we're caching a dict of ints. At most we have 200 ints + Python object
overhead per revlog. And, the cache is really only populated when
performing read-heavy operations, such as adding changegroups or
scanning an individual revlog. For memory bloat to be an issue, we'd
need to scan/read several revisions from several revlogs all while
having active references to several revlogs. I don't think there are
many operations that do this, so I don't think memory bloat from the
cache will be an issue.
This is the first place where we'll start using manifestctx instances instead of
manifestdict. This will facilitate using different manifestctx implementations
in the future.
The file caches we're using to avoid reloading the manifest from disk everytime
has an annoying bug that causes the in memory structure to not be reloaded if
the mtime and the size haven't changed. This causes a breakage in the tests
because the manifestlog is not being reloaded after a commit+strip operation in
mq (the mtime is the same because it all happens in the same second, and the
resulting size is the same because we add 1 and remove 1). The only reason this
doesn't affect the manifest itself is because we touch it so often that we
had already reloaded it after the commit, but before the strip.
Once the entire manifest has migrated to manifestlog, we can get rid of these
properties, since then the manifestlog will be touched after the commit, but
before the strip, as well.
This is the start of a large refactoring of the manifest class. It introduces
the new manifestlog and manifestctx classes which will represent the collection
of all manifests and individual instances, respectively.
Future patches will begin to convert usages of repo.manifest to
repo.manifestlog, adding the necessary functionality to manifestlog and instance
as they are needed.
As part of our refactoring to split the manifest concept from its storage, we
need to start moving the revlog specific parts of the manifest implementation to
a new class. This patch creates manifestrevlog and moves the fulltextcache onto
the base class.
The old manifest cache cached both the inmemory representation and the raw text.
As part of the manifest refactor we want to separate the storage format from the
in memory representation, so let's split this cache into two caches.
This will let other manifest implementations participate in the in memory cache,
while allowing the revlog based implementations to still depend on the full text
caching where necessary.
I've been baffled by this a couple of times (mainly wondering if any
callers of fancyopts.fancyopts that don't use gnu=True exist), so
let's just specify this as a keyword argument to preserve sanity.
Otherwise it would crash if template expression was passed.
This patch unifies the way how boolean expression is evaluated, which involves
BC. Before "if(true)" and "pad(..., 'false')" were False, which are now True
since they are boolean literal and non-empty string respectively.
"func is runsymbol" is the same hack as evalstringliteral(), which is needed
for label() to take color literals.
Before, False was True. This patch fixes the issue by processing True/False
transparently. The other values (including integer 0) are tested as strings
for backward compatibility, which means "if(latesttagdistance)" never be False.
Should we change the behavior of "if(0)" as well?
We can now specify a base map file:
__base__ = path/to/map/file
That map file will be read and used to populate unset elements of the
current map. Unlike using %include, elements in the inherited class
will be read relative to that path.
This makes it much easier to make custom local tweaks to a style.
Now that all users are in exchange, we can safely move the code in the
'exchange' module. This function is really about processing the argument of a
'getbundle' call, so it even makes senses to do so.
There is various version of this function that differ mostly by the way they
define the bundled set. The flexibility is now available in the outgoing object
itself so we move the complexity into the caller themself. This will allow use
to remove a good share of the similar function to obtains a changegroup in the
'changegroup.py' module.
An important side effect is that we stop calling 'computeoutgoing' in
'getchangegroup'. This is fine as code that needs such argument processing
is actually going through the 'exchange' module which already all this function
itself.
This argument can be used instead of 'commonheads' to determine the 'outgoing'
set. We remove the outgoingbetween function as its role can now be handled by
'outgoing' itself.
I've thought of using an external function instead of making the constructor
more complicated. However, there is low hanging fruit to improve the current
code flow by storing some side products of the processing of 'missingroots'. So
in my opinion it make senses to add all this to the class.
We are about to introduce a third option to create an outgoing object:
'missingroots'. This argument will be mutually exclusive with 'commonheads' so
we implement some default value handling in preparation.
This will also help use to make more use of outgoing creation around the code
base.
We are to introduce more code constructing such object in the code base. It will
be more convenient to pass a repository object, all current users already
operate at the repository level anyway. More changes to the contructor argument
are coming in later changeset.