13507 Commits

Author SHA1 Message Date
Siddharth Agarwal
f6cb852638 dirstate: use parsers.make_file_foldmap when available
This is a significant performance win on large repositories. perffilefoldmap:

On Linux/gcc, on a test repo with over 500,000 files:
before: wall 0.605021 comb 0.600000 user 0.560000 sys 0.040000 (best of 17)
after:  wall 0.280530 comb 0.280000 user 0.250000 sys 0.030000 (best of 35)

On Mac OS X/clang, on a real-world repo with over 200,000 files:
before: wall 0.281103 comb 0.280000 user 0.260000 sys 0.020000 (best of 34)
after:  wall 0.133622 comb 0.140000 user 0.120000 sys 0.020000 (best of 65)

This visibly impacts status times on case-insensitive file systems. On the Mac
OS X repo, status goes from 3.64 seconds to 3.50.

With the third-party hgwatchman extension [1], 'hg status' on the same repo
goes from 0.80 seconds to 0.65.

[1] https://bitbucket.org/facebook/hgwatchman
2015-04-01 00:44:33 -07:00
Siddharth Agarwal
1b263fddd0 parsers: add a C function to create a file foldmap
This is a hot path on case-insensitive filesystems -- it's guaranteed to be
called every time 'hg status' is run.

This is significantly faster than the equivalent Python code: see the following
patch for numbers.
2015-03-31 23:32:27 -07:00
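For orientation, a file foldmap can be pictured as a dictionary from the case-normalized name to the spelling actually stored in the dirstate. The sketch below is a pure-Python illustration, not the C implementation this patch adds. The function name, the plain dict standing in for the dirstate, and the rule of skipping entries in state 'r' are all assumptions made for the example.

```python
def make_file_foldmap(dirstate_map, normcase):
    """Map each case-normalized filename to the spelling stored in the dirstate.

    `dirstate_map` is a hypothetical {filename: state} dict and `normcase`
    is any callable that normalizes case (str.lower in this sketch).
    """
    foldmap = {}
    for name, state in dirstate_map.items():
        # Assumption for this sketch: entries marked removed ('r') are skipped.
        if state != 'r':
            foldmap[normcase(name)] = name
    return foldmap


entries = {'CapsDir/AbC.txt': 'n', 'README': 'n', 'old.txt': 'r'}
foldmap = make_file_foldmap(entries, str.lower)
```

Looking up `foldmap['capsdir/abc.txt']` then yields the stored spelling regardless of the case the user typed.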
Siddharth Agarwal
fada28ff91 util.h: define an enum for normcase specs
These will be used in upcoming patches to efficiently create a dirstate
foldmap.
2015-04-02 19:17:32 -07:00
Siddharth Agarwal
6beb10735e parsers._asciitransform: also accept a fallback function
This function will be used in upcoming patches to provide a C implementation of
the function to generate the foldmap.
2015-03-31 23:22:03 -07:00
Siddharth Agarwal
16d94fc24b util: add normcase spec and fallback
These will be used in upcoming patches to efficiently create a dirstate
foldmap.
2015-04-01 00:38:56 -07:00
Yuya Nishihara
91233c9b15 jsonchangeset: set manifest node to "null" for workingctx
Unlike changeset_printer, it does not hide the manifest field because JSON
output will be parsed by machine where explicit "null" will be more useful
than nothing.
2015-03-14 20:16:35 +09:00
Yuya Nishihara
8a805d1c6d jsonchangeset: set rev and node to "null" for workingctx 2015-03-14 20:15:40 +09:00
Yuya Nishihara
6c27e6c97a templater: tell hggettext to collect help of template functions 2015-04-03 21:36:39 +09:00
Martin von Zweigbergk
eeace59f46 treemanifest: disable readdelta optimization
When tree manifests are stored with one revlog per directory and
loaded lazily, it's unclear how much readdelta will help. If only a
few files change, then only a small part of the full manifest will be
loaded, and the delta chains should also be shorter for tree
manifests. Therefore, let's disable readdelta for tree manifests for
now.
2015-03-10 09:57:42 -07:00
Laurent Charignon
a2ad3d1abd phases: make two functions private for phase computation 2015-03-30 15:38:24 -07:00
Siddharth Agarwal
047970182b windows: define normcase spec and fallback
These will be used in upcoming patches to efficiently create a dirstate
foldmap.
2015-04-01 00:31:41 -07:00
Siddharth Agarwal
471dc0d569 encoding.upper: factor out fallback code
This will be used as the fallback function on Windows.
2015-04-01 00:30:41 -07:00
Siddharth Agarwal
1d374c9821 cygwin: define normcase spec and fallback
These will be used in upcoming patches to efficiently create a dirstate
foldmap.

The Cygwin normcase behavior is more complicated than just a simple lowercasing
or uppercasing. That's why we specify 'other'.
2015-04-01 00:29:22 -07:00
Siddharth Agarwal
fdd75c8724 darwin: define normcase spec and fallback
These will be used in upcoming patches to efficiently create a dirstate
foldmap.
2015-03-31 23:30:19 -07:00
Siddharth Agarwal
180770be60 posix: define normcase spec and fallback
These will be used in upcoming patches to efficiently create a dirstate
foldmap.
2015-04-01 00:26:07 -07:00
Siddharth Agarwal
950e16d188 encoding: define an enum that specifies what normcase does to ASCII strings
For C code we don't want to pay the cost of calling into a Python function for
the common case of ASCII filenames. However, while on most POSIX platforms we
normalize filenames by lowercasing them, on Windows we uppercase them. We
define an enum here indicating the direction that filenames should be
normalized as. Some platforms (notably Cygwin) have more complicated
normalization behavior -- we add a case for that too.

In upcoming patches we'll also define a fallback function that is called if the
string has non-ASCII bytes.

This enum will be replicated in the C code to make foldmaps. There's
unfortunately no nice way to avoid that -- we can't have encoding import
parsers because of import cycles. One way might be to have parsers import
encoding, but accessing Python modules from C code is just awkward.

The name 'normcasespecs' was chosen to indicate that this is merely an integer
that specifies a behavior, not a function. The name was pluralized since in
upcoming patches we'll introduce 'normcasespec' which will be one of these
values.
2015-04-01 00:21:10 -07:00
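The idea of an integer spec plus a fallback can be sketched as follows. This is only an illustration of the dispatch described in the message: the constant names and values, and the `normalize` helper, are hypothetical and not taken from the patch.

```python
# Hypothetical constants; the real names and values may differ.
NORMCASE_LOWER = -1
NORMCASE_UPPER = 1
NORMCASE_OTHER = 0


def normalize(name, spec, fallback):
    """Normalize a bytes filename according to an integer spec.

    ASCII names take the cheap path; anything else (or the 'other'
    spec) is handed to the caller-supplied fallback function.
    """
    if spec != NORMCASE_OTHER and name.isascii():
        return name.lower() if spec == NORMCASE_LOWER else name.upper()
    return fallback(name)


def utf8_lower(name):
    # Example fallback, assuming UTF-8 encoded filenames.
    return name.decode('utf-8').lower().encode('utf-8')
```

For example, `normalize(b'Foo.TXT', NORMCASE_UPPER, utf8_lower)` never calls the fallback, while a name containing non-ASCII bytes always does.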
Matt Mackall
e5803764be merge with stable 2015-04-02 16:51:00 -05:00
Yuya Nishihara
2f81a23f54 hgweb: resurrect <span> tag on diffline to fix rendering in monoblue style
It was removed at 9d1f6b229886 as a useless tag, but it is necessary to
apply "div.diff pre span" style.

http://selenic.com/repo/hg/rev/9d1f6b229886?style=monoblue
2015-04-02 21:29:05 +09:00
Gregory Szorc
6ce5b94bd9 json: implement {help} template
We should consider adding HTML rendering of the RST into the response as a
follow-up. I attempted to do this, but there was an empty array
returned by the rstdoc() template function. Not sure what's going on.
Will deal with it later.
2015-04-01 22:24:03 -07:00
Gregory Szorc
26f4be7d62 json: implement {helptopics} template 2015-04-01 22:16:05 -07:00
Gregory Szorc
47b0fd3ed6 json: implement {manifest} template
Property naming was borrowed from `hg files -Tjson`.

We omit branch because, again, representation of branches in this
template is wonky.
2015-04-01 22:04:03 -07:00
Gregory Szorc
351f923f65 json: implement {shortlog} and {changelog} templates
These are the same dispatch function under the hood. The only difference
is the default number of entries to render and the template to use. So
it makes sense to use a shared template.

Format for {changelistentry} is similar to {changeset}. However, there
are differences to argument names and their values preventing us from
(easily) using the same template. (Perhaps there is room to consolidate
the templates as a follow-up.)

We're currently not recording some data in {changelistentry} that exists
in {changeset}. This includes the branch name. This should be added in
a follow-up. For now, something is better than nothing.
2015-03-31 22:53:48 -07:00
Gregory Szorc
acb5f54ecc help: populate template functions via docstrings
We do this for revsets, template keywords, and template filters. Now we
do it for template functions as well.
2015-04-01 20:23:58 -07:00
Gregory Szorc
b6dd5a0076 templater: add consistent docstrings to functions
The content of "hg help templating" is largely derived from docstrings
on functions providing functionality. Template functions are the lone
holdout.

Prepare for generating them dynamically by defining docstrings for all
template functions.

There are numerous ways these docs could be improved. Right now, the
help output simply shows function names and arguments. So literally
any accurate data is better than what is there now.
2015-04-01 20:19:43 -07:00
Matt Harbison
cdcba76207 forget: cleanup the output for an inexact case match on icasefs
Previously, specifying a file name but not matching the dirstate case yielded
the following, even though the file was actually removed:

  $ hg forget capsdir1/capsdir/abc.txt
  not removing capsdir\a.txt: file is already untracked
  removing CapsDir\A.txt
  [1]

This change doesn't appear to cause any extra filesystem accesses, even if a
nonexistent file is specified.

If a directory is specified without a case match, it is (and was previously)
still silently ignored.
2015-03-31 17:42:46 -04:00
Matt Harbison
a3480284f9 dirstate: don't require exact case when adding dirs on icasefs (issue4578)
We don't require it when adding files on a case insensitive filesystem, so don't
require it to add directories for consistency.

The problem with the previous code was that _walkexplicit() was only returning
the normalized directory.  The file(s) in the directory are then appended, and
passed to the matcher.  But if the user asks for 'capsdir1/capsdir', the matcher
will not accept 'CapsDir1/CapsDir/AbC.txt', and the name is dropped.  Matching
based on the non-normalized name is required.

If not normalizing, skip the extra string building for efficiency.  '.' is
replaced with '' so that the path being tested when no file is specified isn't
prefixed with './' (which would make the match fail).
2015-03-31 11:11:39 -04:00
Nathan Goldbaum
08c916638d filemerge: clean up language in mergemarkertemplate help 2015-03-31 11:58:14 -07:00
Yuya Nishihara
4a691471f2 templates: fix "log -q" output of phases style
It had the same problem as be4dab229c78, name conflicts of {node} keyword.
2015-03-28 20:22:03 +09:00
Matt Harbison
e1c0e69712 win32: 'raise ctypes.WinError' -> 'raise ctypes.WinError()'
WinError is a function that creates an Error, not an Error itself.  This is a
partial backout of a9badcbcfb79.
2015-03-22 19:08:13 -04:00
Pierre-Yves David
41927328f0 mergecopies: reuse ancestry context when traversing file history (issue4537)
Merge copies traverses file history in search of copies and renames.
Since 3.3 we have been doing "linkrev adjustment" to ensure duplicated filelog
entries do not confuse the traversal. This "linkrev adjustment" involves
ancestry testing and walking the changeset graph. If we do such a walk of the
changeset graph for each file, we end up with 'O(<changesets>x<files>)'
complexity, which creates massive issues. For example, grafting a changeset in
Mozilla's repo went from 6 seconds to more than 3 minutes.

There is a mechanism to reuse such ancestor computation between all files, but
it has to be set up manually in situations where it makes sense to take such a
shortcut. This changeset sets this mechanism up and brings the graft time back
from 3 minutes to 8 seconds.

To do so, we need more control over the way 'filectx' objects are instantiated
during each 'checkcopies' call that 'mergecopies' makes. We add a new
'setupctx' that configures and returns a 'filectx' factory. The function makes
sure the ancestry context is properly created and the factory makes sure it is
properly installed on each returned 'filectx'.
2015-03-20 00:30:35 -07:00
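The gain from sharing the ancestry computation can be illustrated with a toy graph. This is a simplified sketch, not the 'setupctx' code: the `parents` dict, the `ancestors` helper and the `make_linkrev_checker` factory are all hypothetical names. The point is only that the graph walk happens once, when the factory is created, instead of once per file.

```python
def ancestors(parents, revs):
    """Return the set of revs plus all their ancestors in a toy graph."""
    seen = set()
    stack = list(revs)
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        stack.extend(parents.get(rev, ()))
    return seen


def make_linkrev_checker(parents, revs):
    """Walk the changeset graph once and reuse the result for every file."""
    shared = ancestors(parents, revs)  # the single expensive walk

    def is_ancestor(linkrev):
        # Each per-file check is now a set lookup, not a new graph walk.
        return linkrev in shared

    return is_ancestor


# Toy history: 0 <- 1 <- 2 <- 3, with 4 branching off 1.
toy_parents = {1: [0], 2: [1], 3: [2], 4: [1]}
check = make_linkrev_checker(toy_parents, [3])
```

With N files the cost becomes one walk plus N lookups, rather than N walks.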
Pierre-Yves David
ea1a0fd29f adjustlinkrev: handle 'None' value as source
When the source rev value is 'None', the ctx is a working context. We
cannot compute the ancestors from there so we directly skip to its
parents. This will be necessary to allow a 'None' value for
'_descendantrev', which is itself necessary to make all contexts used in
'mergecopies' reuse the same '_ancestrycontext'.
2015-03-19 23:57:34 -07:00
Pierre-Yves David
7605b9a4cf adjustlinkrev: prepare source revs for ancestry only once
We'll need some more complex initialisation to handle the workingfilectx
case. We make this small change in a separate patch for clarity.
2015-03-19 23:52:26 -07:00
Pierre-Yves David
78f35efc70 annotate: reuse ancestry context when adjusting linkrev (issue4532)
The linkrev adjustment will likely do the same ancestry walk multiple times,
so we already have an optional mechanism to take advantage of this. Since
4e4e9e954fae, linkrev adjustment has been done lazily to avoid too large a
performance impact on rename computation. However, this laziness created a
quadratic situation in 'annotate'.

Mercurial repo: hg annotate mercurial/commands.py
before:   8.090
after:  36.300

Mozilla repo: hg annotate layout/generic/nsTextFrame.cpp
before:   1.190
after:  290.230


So we set up sharing of the ancestry context in the annotate case too. Linkrev
adjustment still has an impact, but it is a much more sensible one.

Mercurial repo: hg annotate mercurial/commands.py
before:  36.300
after:   10.230

Mozilla repo: hg annotate layout/generic/nsTextFrame.cpp
before: 290.230
after:    5.560
2015-03-19 19:52:23 -07:00
Yuya Nishihara
990aa241a1 templates: fix "log -q" output of default style
It was changed at ad92c202bbcd unintentionally due to name conflicts.
2015-03-14 22:34:27 +09:00
Yuya Nishihara
fcd69a9d8b changeset_printer: hide manifest node for workingctx
Because workingctx has no manifest, it makes sense to hide "manifest:" row
completely.
2015-03-14 17:33:22 +09:00
Yuya Nishihara
f52009fe3b changeset_printer: display p1rev:p1node with "+" suffix for workingctx
The templater still can't handle workingctx; that will be fixed later.
2015-03-14 20:01:30 +09:00
Yuya Nishihara
c5790c61c0 changeset_printer: handle workingctx in _meaningful_parentrevs() 2015-03-14 17:29:48 +09:00
Yuya Nishihara
2d0b41ce60 scmutil: add function to help handling workingctx in arithmetic operation
It's unfortunate that the workingctx revision is None, which doesn't work well
in arithmetic operations or comparisons. This function is trivial but will be
used in several places.
2015-03-14 19:38:59 +09:00
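A helper of this kind might look like the sketch below. The name `intrev`, the `repo_len` argument, and the choice of mapping None to the repository length (one past the last real revision) are assumptions made for illustration, not details stated in the message.

```python
def intrev(repo_len, rev):
    """Return an integer usable in arithmetic and comparisons.

    Assumption for this sketch: the working context (rev is None) is
    treated as one past the last real revision, so it sorts last.
    """
    if rev is None:
        return repo_len
    return rev


# With 5 revisions (0..4), the working context sorts after everything else.
ordered = sorted([3, None, 1], key=lambda r: intrev(5, r))
```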
Siddharth Agarwal
9e6d9e8c62 encoding: use parsers.asciiupper when available
This is used on Windows and Cygwin, and the gains from this are expected to be
similar to what was seen in 39fbe33f95fa.
2015-03-31 15:22:09 -07:00
Siddharth Agarwal
863cc77d5f parsers: introduce an asciiupper function 2015-03-31 13:46:21 -07:00
Siddharth Agarwal
a7ee003c9b parsers: make _asciilower a generic _asciitransform function
We can now pass in whatever table we like. For example, an upcoming patch will
introduce asciiupper.
2015-03-31 10:28:17 -07:00
Siddharth Agarwal
d3a8c5ea20 parsers._asciilower: use an explicit return object
No functional change, but this will make upcoming patches cleaner.
2015-04-01 13:58:51 -07:00
Siddharth Agarwal
e455eaff24 parsers: factor out most of asciilower into an internal function
We're going to reuse this in upcoming patches.

The change to Py_ssize_t is necessary because parsers.c doesn't define
PY_SSIZE_T_CLEAN. That macro changes the behavior of PyArg_ParseTuple but not
PyBytes_GET_SIZE.
2015-03-31 10:25:29 -07:00
Martin von Zweigbergk
ebd2a39ab3 manifestv2: add support for writing new manifest format
If .hg/requires has 'manifestv2', the manifest will be written using
the new format.
2015-03-31 14:01:33 -07:00
Martin von Zweigbergk
c5433d6da0 manifestv2: add support for reading new manifest format
The new manifest format is designed to be smaller, in particular to
produce smaller deltas. It stores hashes in binary and puts the hash
on a new line (for smaller deltas). It also uses stem compression to
save space for long paths. The format has room for metadata, but
that's there only for future-proofing. The parser thus accepts any
metadata and throws it away. For more information, see
http://mercurial.selenic.com/wiki/ManifestV2Plan.

The current manifest format doesn't allow an empty filename, so we use
an empty filename on the first line to tell a manifest of the new
format from the old. Since we still never write manifests in the new
format, the added code is unused, but it is tested by
test-manifest.py.
2015-03-27 22:26:41 -07:00
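Stem compression in general means storing, for each path, only how much of the previous path it shares and the part that differs. The sketch below shows that generic idea on sorted paths. It does not reproduce the manifestv2 byte layout, and both function names are hypothetical.

```python
import os


def stem_compress(paths):
    """Encode each path as (shared prefix length, remaining suffix)."""
    entries = []
    prev = ''
    for path in paths:
        shared = len(os.path.commonprefix([prev, path]))
        entries.append((shared, path[shared:]))
        prev = path
    return entries


def stem_decompress(entries):
    """Rebuild the full paths from (shared prefix length, suffix) pairs."""
    paths = []
    prev = ''
    for shared, suffix in entries:
        prev = prev[:shared] + suffix
        paths.append(prev)
    return paths


sample = ['a/b/c.txt', 'a/b/d.txt', 'a/e.txt']
packed = stem_compress(sample)
```

Long shared directory prefixes collapse to a small integer, which is where the space saving for deep trees comes from.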
Martin von Zweigbergk
e931247479 manifestv2: set requires at repo creation time
While it should be safe to switch to the new manifest format on an
existing repo, let's keep it simple for now and make the configuration
have any effect only at repo creation time. If the configuration is
enabled then (at repo creation), we add an entry to requires and read
that instead of the configuration from then on.
2015-03-31 22:45:45 -07:00
Yuya Nishihara
97a7438eb1 templatefilters: add "upper" and "lower" for case conversion
Typically it will be used in patchbomb's flag template, which will be
implemented by future patches.
2015-03-30 23:54:29 +09:00
Durham Goode
e33ea26dd5 repoview: improve compute staticblockers perf
Previously we would compute the repoview's static blockers by finding all the
children of hidden commits that were not hidden.  This was O(number of commits
since first hidden change) since 'children' requires walking every commit from
tip until the first hidden change.

The new algorithm walks all heads down until it sees a public commit. This makes
the computation O(number of draft commits), which is much faster in large
repositories with a large number of commits and a low number of drafts.

On a large repo with 1000+ obsolete markers and the earliest draft commit around
tip~200000, this improves computehidden perf by 200x (2s to 0.01s).
2015-04-01 12:50:10 -07:00
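The walk described above can be sketched on a toy graph. This is an illustration under simplifying assumptions, not the repoview code: revisions are integers, `parents` is a plain dict, and `hidden` and `public` are sets supplied by the caller. The function name and signature are hypothetical.

```python
def static_blockers(parents, heads, hidden, public):
    """Find non-hidden commits whose parent is hidden.

    Walks down from the heads and stops at the first public commit on
    each path, so only draft commits are ever visited.
    """
    blockers = set()
    seen = set(heads)
    stack = list(heads)
    while stack:
        rev = stack.pop()
        if rev in public:
            continue  # do not walk below public commits
        for parent in parents.get(rev, ()):
            if rev not in hidden and parent in hidden:
                blockers.add(rev)
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return blockers


# Toy history: 0 and 1 are public; 2 is a hidden draft; 3 and 4 are visible drafts.
toy_parents = {1: [0], 2: [1], 3: [2], 4: [1]}
result = static_blockers(toy_parents, heads=[3, 4], hidden={2}, public={0, 1})
```

Here revision 3 is reported because it is visible but sits on top of hidden revision 2, and revision 0 is never visited.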
Gregory Szorc
35a9613552 hgweb: add phase to {changeset} template
It's pretty surprising phase wasn't part of this template call already.
We now expose {phase} to the {changeset} template and we expose this
data to JSON.

This brings JSON output in line with the output from `hg log -Tjson`.
The lone exception is that hgweb doesn't print the numeric rev. As has been
stated previously, I don't believe hgweb should be exposing these
unstable identifiers. (We can add them later if we really want them.)
There is still work to bring hgweb to parity with --verbose and
--debug output from the CLI.
2015-03-31 22:29:12 -07:00
Gregory Szorc
3b964c6e89 json: implement {changeset} template
Output only contains basic changeset information for the moment. The
format is compatible with `hg log -Tjson`.
2015-03-31 22:35:12 -07:00