Commit Graph

16682 Commits

Author SHA1 Message Date
Yuya Nishihara
a0741d6e1d formatter: add fm.nested(field) to either write or build sub items
We sometimes need to build nested items by formatter, but there was no
convenient way other than building and putting them manually by fm.data():

  exts = []
  for n, v in extensions:
      fm.plain('%s %s\n' % (n, v))
      exts.append({'name': n, 'ver': v})
  fm.data(extensions=exts)

This should work for simple cases, but doing this would make it harder to
change the underlying data type for better templating support.

So this patch provides fm.nested(field), which returns new nested formatter
(or self if items aren't structured and just written to ui.) A nested formatter
stores items which will later be rendered by the parent formatter.

  fn = fm.nested('extensions')
  for n, v in extensions:
      fn.startitem()
      fn.write('name ver', '%s %s\n', n, v)
  fn.end()
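
To make the data flow concrete, here is a minimal sketch of how a nested formatter could hand its items to the parent. ToyFormatter and all of its internals are hypothetical stand-ins written for illustration; only the method names startitem(), write(), nested() and end() come from the message above, and the real formatter classes differ.

```python
class ToyFormatter:
    """Hypothetical data-building formatter; not Mercurial's implementation."""

    def __init__(self):
        self._items = []   # finished and in-progress items
        self._item = None  # the item currently being filled

    def startitem(self):
        self._item = {}
        self._items.append(self._item)

    def write(self, fields, fmt, *values):
        # This toy only builds data, so the format string is ignored and
        # each value is stored under its field name.
        self._item.update(zip(fields.split(), values))

    def nested(self, field):
        # The child's item list is attached to the parent's current item,
        # so the parent renders it later along with everything else.
        child = ToyFormatter()
        self._item[field] = child._items
        return child

    def end(self):
        return self._items


fm = ToyFormatter()
fm.startitem()
fn = fm.nested('extensions')
for n, v in [('rebase', '1.0'), ('mq', '2.0')]:
    fn.startitem()
    fn.write('name ver', '%s %s\n', n, v)
fn.end()
result = fm.end()
```

The parent ends up holding a list of dicts under 'extensions', which is the shape a template expression such as a "%" map would iterate over.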

Nested items are directly exported to a template for now:

  {extensions % "{name} {ver}\n"}

There's no {extensions} nor {join(extensions, sep)} yet. I have a plan for
them by extending fm.nested() API, but I want to revisit it after trying
out this API in the real world.
2016-03-13 19:59:39 +09:00
Yuya Nishihara
2c363af9f2 formatter: factor out format*() functions to separate classes
New converter classes will be reused by a nested formatter. See the next
patch for details.

This change is also good in that the default values are defined uniquely
by the baseformatter.
2016-08-15 13:51:14 +09:00
Jun Wu
7957528e61 crecord: restore SIGWINCH handler before return
Previously, the SIGWINCH handler was not cleared, and if the commit
message editor also needs SIGWINCH handling (like vim), the two SIGWINCH
handlers (the editor's, ours) will have a race. And we may erase the
editor's screen content.

This patch restores SIGWINCH handler to address the above issue.
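
The save-and-restore pattern can be sketched as follows. run_with_resize_handler is a hypothetical helper, not the crecord code; it only assumes the standard signal module.

```python
import signal


def run_with_resize_handler(handler, body):
    """Install `handler` for SIGWINCH while `body()` runs, then restore.

    Hypothetical helper for illustration only.
    """
    if not hasattr(signal, "SIGWINCH"):  # some platforms lack SIGWINCH
        return body()
    previous = signal.signal(signal.SIGWINCH, handler)
    try:
        return body()
    finally:
        # Put back whatever was installed before, so our handler no longer
        # redraws the screen once another program owns the terminal.
        signal.signal(signal.SIGWINCH, previous)
```

Using try/finally means the previous handler is reinstated on normal return and on exceptions alike.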
2016-08-24 11:24:07 +01:00
Maciej Fijalkowski
9fdc692e77 bdiff: implement cffi version of bdiff 2016-08-20 23:06:01 +02:00
Maciej Fijalkowski
a33764dd71 bdiff: implement cffi version of blocks 2016-07-28 14:17:08 +02:00
Tony Tung
b79d52dab0 util: checknlink should remove file it creates if an exception occurs
There's no reason to leave the file behind.
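
A minimal sketch of the cleanup pattern, assuming a scratch file created with tempfile; probe_with_cleanup is a hypothetical name and this is not the body of util.checknlink.

```python
import os
import tempfile


def probe_with_cleanup(directory, probe):
    """Create a scratch file, run `probe(path)`, and always delete the file."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        return probe(path)
    finally:
        # Runs on success and on exception, so nothing is left behind.
        if os.path.exists(path):
            os.unlink(path)
```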
2016-08-19 13:30:40 -07:00
Siddharth Agarwal
52c9a999c5 merge: remove files with extra actions from merge action list
See the comment for a detailed explanation why.

Even though this is a bug, I've sent it to 'default' rather than 'stable'
because it isn't triggered in any code paths in stock Mercurial, just with the
merge driver included. For the same reason I haven't included any tests here --
the merge driver is getting a new test.
2016-08-23 17:58:53 -07:00
Gregory Szorc
e339efbb5b revlog: use an LRU cache for delta chain bases
Profiling using statprof revealed a hotspot during changegroup
application calculating delta chain bases on generaldelta repos.
Essentially, revlog._addrevision() was performing a lot of redundant
work tracing the delta chain as part of determining when the chain
distance was acceptable. This was most pronounced when adding
revisions to manifests, which can have delta chains thousands of
revisions long.

There was a delta chain base cache on revlogs before, but it only
captured a single revision. This was acceptable before generaldelta,
when _addrevision would build deltas from the previous revision and
thus we'd pretty much guarantee a cache hit when resolving the delta
chain base on a subsequent _addrevision call. However, it isn't
suitable for generaldelta because parent revisions aren't necessarily
the last processed revision.

This patch converts the delta chain base cache to an LRU dict cache.
The cache can hold multiple entries, so generaldelta repos have a
higher chance of getting a cache hit.
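
The idea can be sketched with a small LRU mapping. LRUDict, chain_base and the delta_parent list are hypothetical illustrations, not the revlog code or util.lrucachedict; a revision whose delta parent is itself stands for a full snapshot here.

```python
from collections import OrderedDict


class LRUDict:
    """Tiny LRU mapping used only for this sketch."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)  # evict least recently used


def chain_base(rev, delta_parent, cache):
    """Follow delta parents from `rev` until a full snapshot is reached.

    `delta_parent[r] == r` marks revision r as a snapshot (the chain base).
    """
    walked = []
    while True:
        cached = cache.get(rev)
        if cached is not None:
            base = cached
            break
        if delta_parent[rev] == rev:
            base = rev
            break
        walked.append(rev)
        rev = delta_parent[rev]
    # Remember the answer for every revision visited on the way.
    for r in walked:
        cache[r] = base
    return base
```

With a single-entry cache, a lookup for a revision on a different branch of the delta tree would discard the previous answer; keeping several entries lets interleaved chains share the work.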

The impact of this change when processing changegroup additions is
significant. On a generaldelta conversion of the "mozilla-unified"
repo (which contains heads of the main Firefox repositories in
chronological order - this means there are lots of transitions between
heads in revlog order), this change has the following impact when
performing an `hg unbundle` of an uncompressed bundle of the repo:

before: 5:42 CPU time
after:  4:34 CPU time

Most of this time is saved when applying the changelog and manifest
revlogs:

before: 2:30 CPU time
after:  1:17 CPU time

That's nearly a 50% reduction in CPU time applying changesets and
manifests!

Applying a gzipped bundle of the same repo (effectively simulating a
`hg clone` over HTTP) showed a similar speedup:

before: 5:53 CPU time
after:  4:46 CPU time

Wall time improvements were basically the same as CPU time.

I didn't measure explicitly, but it feels like most of the time
is saved when processing manifests. This makes sense, as large
manifests tend to have very long delta chains and thus benefit the
most from this cache.

So, this change effectively makes changegroup application (which is
used by `hg unbundle`, `hg clone`, `hg pull`, `hg unshelve`, and
various other commands) significantly faster when delta chains are
long (which can happen on repos with large numbers of files and thus
large manifests).

In theory, this change can result in more memory utilization. However,
we're caching a dict of ints. At most we have 200 ints + Python object
overhead per revlog. And, the cache is really only populated when
performing read-heavy operations, such as adding changegroups or
scanning an individual revlog. For memory bloat to be an issue, we'd
need to scan/read several revisions from several revlogs all while
having active references to several revlogs. I don't think there are
many operations that do this, so I don't think memory bloat from the
cache will be an issue.
2016-08-22 21:48:50 -07:00
Gregory Szorc
60ecfeec38 revlog: remove unused variables 2016-08-22 20:17:36 -07:00
Gregory Szorc
06b69e4299 util: properly implement lrucachedict.get()
Before, it was returning the raw _lrucachenode instance instead of its
value.
2016-08-22 20:30:37 -07:00
Durham Goode
3f8e8b7f90 manifest: change changectx to access manifest via manifestlog
This is the first place where we'll start using manifestctx instances instead of
manifestdict. This will facilitate using different manifestctx implementations
in the future.
2016-08-17 13:25:13 -07:00
Durham Goode
f38741166f manifest: use property instead of field for manifest revlog storage
The file caches we're using to avoid reloading the manifest from disk every time
have an annoying bug that causes the in memory structure to not be reloaded if
the mtime and the size haven't changed. This causes a breakage in the tests
because the manifestlog is not being reloaded after a commit+strip operation in
mq (the mtime is the same because it all happens in the same second, and the
resulting size is the same because we add 1 and remove 1). The only reason this
doesn't affect the manifest itself is because we touch it so often that we
had already reloaded it after the commit, but before the strip.
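
The failure mode can be reproduced with a naive cache key built from mtime and size. cache_key is a hypothetical helper, and the real file cache may compare more than these two fields; os.utime is used here to imitate two writes landing in the same clock tick.

```python
import os
import tempfile


def cache_key(path):
    """Return the (mtime, size) pair a naive file cache might compare."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index")
    with open(path, "wb") as f:
        f.write(b"rev-A")
    before = cache_key(path)

    # Write different bytes of the same length, then force the old
    # timestamp back to imitate both writes happening in one second.
    with open(path, "wb") as f:
        f.write(b"rev-B")
    os.utime(path, ns=(before[0], before[0]))
    after = cache_key(path)

    with open(path, "rb") as f:
        content_changed = f.read() != b"rev-A"
    stale = before == after  # the cache would wrongly skip the reload
```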

Once the entire manifest has migrated to manifestlog, we can get rid of these
properties, since then the manifestlog will be touched after the commit, but
before the strip, as well.
2016-08-17 13:25:13 -07:00
Durham Goode
4c0439aa0a manifest: introduce manifestlog and manifestctx classes
This is the start of a large refactoring of the manifest class. It introduces
the new manifestlog and manifestctx classes which will represent the collection
of all manifests and individual instances, respectively.

Future patches will begin to convert usages of repo.manifest to
repo.manifestlog, adding the necessary functionality to manifestlog and instance
as they are needed.
2016-08-17 13:25:13 -07:00
Durham Goode
948314a949 manifest: make manifest derive from manifestrevlog
As part of our refactoring to split the manifest concept from its storage, we
need to start moving the revlog specific parts of the manifest implementation to
a new class. This patch creates manifestrevlog and moves the fulltextcache onto
the base class.
2016-08-17 13:25:13 -07:00
Durham Goode
9dfdbc1f92 manifest: break mancache into two caches
The old manifest cache cached both the inmemory representation and the raw text.
As part of the manifest refactor we want to separate the storage format from the
in memory representation, so let's split this cache into two caches.

This will let other manifest implementations participate in the in memory cache,
while allowing the revlog based implementations to still depend on the full text
caching where necessary.
2016-08-17 13:25:13 -07:00
Augie Fackler
afb1d21a9d dispatch: explicitly pass fancyopts optional arg as a keyword
I've been baffled by this a couple of times (mainly wondering if any
callers of fancyopts.fancyopts that don't use gnu=True exist), so
let's just specify this as a keyword argument to preserve sanity.
2016-08-18 11:32:02 -04:00
Maciej Fijalkowski
521b1aecb5 osutil: fix the bug on OS X when we return more in listdir
The pointer arithmetic somehow got omitted during the recent change to use
a struct.
2016-08-20 23:05:18 +02:00
Hannes Oldenburg
03a02bd11b cmdutil: extract samefile function from amend() 2016-08-21 08:00:18 +00:00
Yuya Nishihara
6b6fa2e134 templater: rename "right" argument of pad() function
Before, right=True meant right justify, which I think is left padding.
2016-04-22 21:32:30 +09:00
Yuya Nishihara
f9c7bd7213 templater: make pad() evaluate boolean argument (BC)
Otherwise it would crash if template expression was passed.

This patch unifies the way how boolean expression is evaluated, which involves
BC. Before "if(true)" and "pad(..., 'false')" were False, which are now True
since they are boolean literal and non-empty string respectively.

"func is runsymbol" is the same hack as evalstringliteral(), which is needed
for label() to take color literals.
2016-04-22 21:29:13 +09:00
Yuya Nishihara
e86fcc2294 templater: fix if() to not evaluate False as bool('False')
Before, False was True. This patch fixes the issue by processing True/False
transparently. The other values (including integer 0) are tested as strings
for backward compatibility, which means "if(latesttagdistance)" can never be False.
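
The rule described above can be sketched in a few lines. evalboolean_sketch is a hypothetical function, and treating None as an empty string is an assumption made for this illustration rather than a statement about the templater's internals.

```python
def evalboolean_sketch(value):
    """Real booleans pass through; anything else is judged by its string form."""
    if isinstance(value, bool):
        return value
    text = "" if value is None else str(value)  # assumption: None renders as ""
    return bool(text)


# The old behavior amounted to testing the string form of everything:
old_false = bool(str(False))  # 'False' is a non-empty string
```

Here integer 0 still counts as true because its string form "0" is non-empty, which is why a distance of zero cannot be detected with if().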

Should we change the behavior of "if(0)" as well?
2016-08-18 16:29:22 +09:00
Yuya Nishihara
03b3f18ede templater: make it clearer that _flatten() omits None 2016-08-18 15:55:07 +09:00
Gábor Stefanik
71039079d7 revset: support "follow(renamed.py, e22f4f3f06c3)" (issue5334)
v2: fixes from review
2016-08-18 17:25:10 +02:00
Matt Mackall
f6bd7a4c39 coal: use inheritance to derive from paper
This illustrates how much simpler this approach is, in particular the
effect of map-relative paths.
2016-08-17 13:43:13 -05:00
Matt Mackall
60e14951ba templater: add inheritance support to style maps
We can now specify a base map file:

__base__ = path/to/map/file

That map file will be read and used to populate unset elements of the
current map. Unlike using %include, elements in the inherited class
will be read relative to that path.

This makes it much easier to make custom local tweaks to a style.
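
The "populate unset elements" behavior can be sketched with plain dicts. resolve_map and the sample maps are hypothetical; the sketch ignores file reading and the map-relative path handling described above.

```python
def resolve_map(name, maps):
    """Merge a style map with its `__base__` chain; entries in the child win."""
    current = dict(maps[name])
    base = current.pop("__base__", None)
    if base is None:
        return current
    merged = resolve_map(base, maps)
    merged.update(current)  # only elements the child left unset come from the base
    return merged


maps = {
    "paper": {"header": "paper/header.tmpl", "footer": "paper/footer.tmpl"},
    "coal": {"__base__": "paper", "header": "coal/header.tmpl"},
}
coal = resolve_map("coal", maps)
```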
2016-08-17 13:40:27 -05:00
Pierre-Yves David
7775b9bfe7 computeoutgoing: move the function from 'changegroup' to 'exchange'
Now that all users are in exchange, we can safely move the code in the
'exchange' module. This function is really about processing the argument of a
'getbundle' call, so it even makes sense to do so.
2016-08-09 17:06:35 +02:00
Pierre-Yves David
1b40b7e1c5 getchangegroup: take an 'outgoing' object as argument (API)
There are various versions of this function that differ mostly by the way they
define the bundled set. The flexibility is now available in the outgoing object
itself, so we move the complexity into the callers themselves. This will allow us
to remove a good share of the similar functions that obtain a changegroup in the
'changegroup.py' module.

An important side effect is that we stop calling 'computeoutgoing' in
'getchangegroup'. This is fine as code that needs such argument processing
is actually going through the 'exchange' module which already calls this function
itself.
2016-08-09 17:00:38 +02:00
Pierre-Yves David
cca37f1814 outgoing: add a 'missingroots' argument
This argument can be used instead of 'commonheads' to determine the 'outgoing'
set. We remove the outgoingbetween function as its role can now be handled by
'outgoing' itself.

I've thought of using an external function instead of making the constructor
more complicated. However, there is low hanging fruit to improve the current
code flow by storing some side products of the processing of 'missingroots'. So
in my opinion it makes sense to add all this to the class.
2016-08-09 22:31:38 +02:00
Pierre-Yves David
a9e69178e3 outgoing: adds some default value for argument
We are about to introduce a third option to create an outgoing object:
'missingroots'. This argument will be mutually exclusive with 'commonheads' so
we implement some default value handling in preparation.

This will also help us to make more use of outgoing creation around the code
base.
2016-08-09 15:55:44 +02:00
Pierre-Yves David
f460bb8823 outgoing: pass a repo object to the constructor
We are about to introduce more code constructing such objects in the code base. It will
be more convenient to pass a repository object; all current users already
operate at the repository level anyway. More changes to the constructor arguments
are coming in later changesets.
2016-08-09 15:26:53 +02:00
Hannes Oldenburg
cca2fb75c5 match: remove matchessubrepo method (API)
Since it is no longer used in cmdutil.{files,remove} and scmutil.addremove,
we remove this method.
2016-08-16 08:21:16 +00:00
Hannes Oldenburg
38a18d3489 subrepo: cleanup of subrepo filematcher logic
Previously in the worst case we iterated the files in matcher twice and
had a method only for this, which reimplemented logic in subdirmatcher's
constructor. So we replaced the method with a subdirmatcher.files() call.
2016-08-16 08:15:12 +00:00
Yuya Nishihara
69789d265a pycompat: delay loading modules registered to stub
The replacement _pycompatstub is designed to be compatible with our demandimporter.
try-except is replaced by version comparison because ImportError will no longer
be raised immediately.
2016-08-14 14:46:24 +09:00
Yuya Nishihara
12eca1889e py3: import builtin wrappers automagically by code transformer
This should be less invasive than mucking builtins.

Since tokenize.untokenize() looks at start/end positions of tokens, we calculate
them from the NEWLINE token of the future import.
2016-08-16 12:35:15 +09:00
Yuya Nishihara
29c5e8dc21 py3: provide (del|get|has|set)attr wrappers that accepts bytes
These functions will be imported automagically by our code transformer.

getattr() and setattr() are widely used in our code. We probably wouldn't
want to rewrite every single call of getattr/setattr. delattr() and hasattr()
aren't that important, but they are functions of the same kind.
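
A minimal sketch of such wrappers on Python 3 follows. The names _wrapattrfunc and the *_b aliases are hypothetical, and decoding with UTF-8 is an assumption made for this sketch rather than a statement about the patch.

```python
import builtins


def _wrapattrfunc(f):
    """Return a version of `f` that also accepts a bytes attribute name."""
    def wrapper(obj, name, *args):
        if isinstance(name, bytes):
            name = name.decode("utf-8")  # assumed encoding for this sketch
        return f(obj, name, *args)
    return wrapper


getattr_b = _wrapattrfunc(builtins.getattr)
setattr_b = _wrapattrfunc(builtins.setattr)
hasattr_b = _wrapattrfunc(builtins.hasattr)
delattr_b = _wrapattrfunc(builtins.delattr)
```

One generic wrapper covers all four builtins because each of them takes the object first and the attribute name second.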
2016-08-14 12:51:21 +09:00
Yuya Nishihara
df4c2c74a3 py3: check python version to enable builtins hack
Future patches will add (del|get|has|set)attr wrappers.
2016-08-14 12:44:13 +09:00
Yuya Nishihara
1532480b48 py3: move xrange alias next to import lines
Builtin functions should be available in compatibility code.
2016-08-14 12:41:54 +09:00
Yuya Nishihara
7105924c83 debugobsolete: add formatter support (issue5134)
It appears that computing index isn't cheap if --rev is specified. That's
why "index" field is available only if --index is specified.

I've named marker.flags() as "flag" because "flags" implies a list or dict
in template world.

Thanks to Piotr Listkiewicz for the initial implementation of this patch.
2016-08-15 16:07:55 +09:00
Yuya Nishihara
d586657c81 formatter: add function to convert dict to appropriate format
This will be used to process key-value pairs by formatter. The default
field names and format are derived from the {extras} template keyword.

Tests will be added later.
2016-08-15 12:58:33 +09:00
Gregory Szorc
312f42b6e4 hgweb: tweak zlib chunking behavior
When doing streaming compression with zlib, zlib appears to emit chunks
with data only after ~20-30kb of input is available on average. In other words, most
calls to compress() return an empty string. On the mozilla-unified repo,
only 48,433 of 921,167 (5.26%) of calls to compress() returned data.
In other words, we were sending hundreds of thousands of empty chunks
via a generator where they touched who knows how many frames (my guess
is millions). Filtering out the empty chunks from the generator
cuts down on overhead.

In addition, we were previously feeding 8kb chunks into zlib
compression. Since this function tends to emit *compressed* data after
20-30kb is available, it would take several calls before data was
produced. We increase the amount of data fed in at a time to 32kb.
This reduces the number of calls to compress() from 921,167 to
115,146. It also reduces the number of output chunks from 48,433 to
31,377. This does increase the average output chunk size by a little.
But I don't think this will matter in most scenarios.

The combination of these 2 changes appears to shave ~6s CPU time
or ~3% from a server serving the mozilla-unified repo.
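
Both changes can be sketched with the standard zlib module. compressed_chunks is a hypothetical generator, not the hgweb code; the 32kb feed size is taken from the description above.

```python
import zlib


def compressed_chunks(data, feed_size=32 * 1024):
    """Yield non-empty compressed chunks, feeding `feed_size` bytes at a time."""
    z = zlib.compressobj()
    for start in range(0, len(data), feed_size):
        out = z.compress(data[start:start + feed_size])
        if out:  # compress() often returns b"" while zlib buffers input
            yield out
    tail = z.flush()
    if tail:
        yield tail
```

Dropping the empty chunks does not change the compressed stream; it only avoids pushing empty strings through every consumer of the generator.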
2016-08-14 21:29:46 -07:00
Gregory Szorc
118980f02b hgweb: document why we don't allow untrusted settings to control zlib
Added comment per discussion on mercurial-devel.
2016-08-15 20:39:33 -07:00
Gregory Szorc
4cfd8623b8 hgweb: profile HTTP requests
Currently, running `hg serve --profile` doesn't yield anything useful:
when the process is terminated the profiling output displays results
from the main thread, which typically spends most of its time in
select.select(). Furthermore, it has no meaningful results from
mercurial.* modules because the threads serving HTTP requests don't
actually get profiled.

This patch teaches the hgweb wsgi applications to profile individual
requests. If profiling is enabled, the profiler kicks in after
HTTP/WSGI environment processing but before Mercurial's main request
processing.

The profile results are printed to the configured profiling output.
If running `hg serve` from a shell, they will be printed to stderr,
just before the HTTP request line is logged. If profiling to a file,
we only write a single profile to the file because the file is not
opened in append mode. We could add support for appending to files
in a future patch if someone wants it.

Per request profiling doesn't work with the statprof profiler because
internally that profiler collects samples from the thread that
*initially* requested profiling be enabled. I have plans to address
this by vendoring Facebook's customized statprof and then improving
it.
2016-08-14 18:37:24 -07:00
Gregory Szorc
2ed4e485bc hgweb: abstract call to hgwebdir wsgi function
The function names and behavior now match hgweb. The reason for this
will be obvious in the next patch.
2016-08-14 16:03:30 -07:00
Gregory Szorc
e6e0818daa profiling: don't error with statprof when profiling has already started
statprof.reset() asserts if profiling has already started. So don't
call it if profiling is already running.
2016-08-14 18:28:43 -07:00
Gregory Szorc
9ac5776ef7 profiling: add a context manager that no-ops if profiling isn't enabled
And refactor dispatch.py to use it. As you can see, the resulting code
is much simpler.
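
The shape of such a context manager can be sketched with the standard library. maybe_profile is a hypothetical name, and cProfile stands in for whichever profiler is configured; this is not the code in profiling.py.

```python
import contextlib
import cProfile
import io
import pstats


@contextlib.contextmanager
def maybe_profile(enabled, out):
    """Profile the with-block when `enabled`; otherwise do nothing."""
    if not enabled:
        yield
        return
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        yield
    finally:
        profiler.disable()
        stats = pstats.Stats(profiler, stream=out)
        stats.sort_stats("cumulative").print_stats(5)


out = io.StringIO()
with maybe_profile(True, out):
    sorted(range(1000), reverse=True)
report = out.getvalue()
```

Callers write the same with-block whether or not profiling is on, so the enabled check lives in one place instead of at every call site.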

I was tempted to inline _runcommand as part of writing this series.
However, a number of extensions wrap _runcommand. So keeping it around
is necessary (extensions can't easily wrap runcommand because it calls
hooks before and after command execution).
2016-08-14 17:51:12 -07:00
Gregory Szorc
fbd4d1a639 profiling: make profiling functions context managers (API)
This makes profiling more flexible since we can now call multiple
functions when a profiler is active. But the real reason for this
is to enable a future consumer to profile a function that returns
a generator. We can't do this from the profiling function itself
because functions can either be generators or have return values:
they can't be both. So therefore it isn't possible to have a generic
profiling function that can both consume and re-emit a generator
and return a value.
2016-08-14 18:25:22 -07:00
Gregory Szorc
1aca3f1e38 dispatch: set profiling.enabled when profiling is enabled
We do this for other global command arguments. We don't for --profile
for reasons that are unknown to me. Probably because nobody has needed
it.

An upcoming patch will introduce a new consumer of the profiling
code. It doesn't have access to command line arguments. So let's
set the config option during argument processing.

We also remove a check for "options['profile']" because it is now
redundant.
2016-08-14 16:35:58 -07:00
Gregory Szorc
bdb3786ca0 profiling: move profiling code from dispatch.py (API)
Currently, profiling code lives in dispatch.py, which is a low-level
module centered around command dispatch. Furthermore, dispatch.py
imports a lot of other modules, meaning that importing dispatch.py
to get at profiling functionality would often result in a module import
cycle.

Profiling is a generic activity. It shouldn't be limited to command
dispatch. This patch moves profiling code from dispatch.py to the
new profiling.py. The low-level "run a profiler against a function"
functions have been moved verbatim. The code for determining how to
invoke the profiler has been extracted to its own function.

I decided to create a new module rather than stick this code
elsewhere (such as util.py) because util.py is already quite large.
And, I foresee this file growing larger once Facebook's profiling
enhancements get added to it.
2016-08-14 16:30:44 -07:00
Augie Fackler
cb268cbd2f merge with stable 2016-08-15 12:26:02 -04:00
Pulkit Goyal
0ce0d571e7 pycompat: avoid using an extra function
We have a single-line function which just lowercases the letters and replaces
"_" with "". It's better to avoid that function call. Moreover, we are calling this
function around 33 times.
2016-08-13 04:21:42 +05:30