Commit Graph

17876 Commits

Author SHA1 Message Date
Remi Chaintron
dfc79cbfc3 revlog: flag processor
Add the ability for revlog objects to process revision flags and apply
registered transforms on read/write operations.

This patch introduces:
- the 'revlog._processflags()' method that looks at revision flags and applies
  flag processors registered on them. Due to the need to handle non-commutative
  operations, flag transforms are applied in stable order but the order in which
  the transforms are applied is reversed between read and write operations.
- the 'addflagprocessor()' method, which registers processors on flags.
  Flag processors are defined as a 3-tuple of (read, write, raw) functions to be
  applied depending on the operation being performed.
- an update on 'revlog.addrevision()' behavior. The current flagprocessor design
  relies on extensions to wrap around 'addrevision()' to set flags on revision
  data, and on the flagprocessor to perform the actual transformation of its
  contents. In the lfs case, this means we need to process flags before we meet
  the 2GB size check, leading to performing some operations before it happens:
  - if flags are set on the revision data, we assume some extensions might be
    modifying the contents using the flag processor next, and we compute the
    node for the original revision data (still allowing extension to override
    the node by wrapping around 'addrevision()').
  - we then invoke the flag processor to apply registered transforms (in lfs's
    case, drastically reducing the size of large blobs).
  - finally, we proceed with the 2GB size check.

Note: In the case a cachedelta is passed to 'addrevision()' and we detect the
flag processor modified the revision data, we chose to trust the flag processor
and drop the cachedelta.
2017-01-10 16:15:21 +00:00
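Illustration for the "revlog: flag processor" entry above: a minimal sketch of why the transform order is reversed between read and write. The flag names, the registry and both helper functions are hypothetical and are not Mercurial's actual API; the assumption here is that reads walk the stable order forward and writes walk it backward.

```python
import base64

# Hypothetical flag bits and registry; illustrative names only.
FLAG_TAGGED = 1 << 0
FLAG_ENCODED = 1 << 1
FLAG_ORDER = [FLAG_TAGGED, FLAG_ENCODED]  # stable processing order

_processors = {}


def add_flag_processor(flag, processor):
    """Register a (read, write, raw) 3-tuple for one flag."""
    if flag in _processors:
        raise ValueError("flag %d already has a processor" % flag)
    _processors[flag] = processor


def process_flags(text, flags, operation):
    """Apply the transform of every set flag; 'write' walks the order reversed."""
    order = FLAG_ORDER if operation == "read" else FLAG_ORDER[::-1]
    index = 0 if operation == "read" else 1
    for flag in order:
        if flags & flag and flag in _processors:
            text = _processors[flag][index](text)
    return text


# Two non-commutative transforms: a prefix tag and base64 encoding.
# The third (raw) element is a placeholder that this sketch never calls.
add_flag_processor(
    FLAG_TAGGED,
    (lambda t: t[len(b"tag:"):], lambda t: b"tag:" + t, lambda t: t),
)
add_flag_processor(
    FLAG_ENCODED,
    (base64.b64decode, base64.b64encode, lambda t: t),
)

flags = FLAG_TAGGED | FLAG_ENCODED
stored = process_flags(b"hello", flags, "write")   # encode first, then tag
restored = process_flags(stored, flags, "read")    # untag first, then decode
```

Because the tag is added last on write, it has to be removed first on read; walking the same order in both directions would try to base64-decode the tagged text.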
Remi Chaintron
bd07cff7ec revlog: pass revlog flags to addrevision
Adding the ability to pass flags to addrevision instead of simply passing
default flags to _addrevision will allow extensions relying on flag transforms
to wrap around addrevision() in order to update revlog flags.

The first use case of this patch will be the lfs extension marking nodes as
stored externally when the contents are larger than the defined threshold.

One of the reasons leading to setting flags in addrevision() wrappers in the
flag processor design is that it allows detecting files larger than the 2GB
limit before the check is performed, which allows lfs to transform the contents
into metadata.
2017-01-05 17:16:07 +00:00
Remi Chaintron
6d11b9177b revlog: add 'raw' argument to revision and _addrevision
This patch introduces a new 'raw' argument (defaults to False) to revlog's
revision() and _addrevision() methods.
When the 'raw' argument is set to True, it indicates the revision data should be
handled as raw data by the flagprocessor.

Note: Given revlog.addgroup() calls are restricted to changegroup generation, we
can always set raw to True when calling revlog._addrevision() from
revlog.addgroup().
2017-01-05 17:16:07 +00:00
Jun Wu
b61b02a865 chg: remove getpager support
We have enough bits to switch to the new chg pager code path in runcommand.
So just remove the legacy getpager support.

This is a red-only patch, and will break chg's pager support temporarily.
2017-01-10 06:59:39 +08:00
Jun Wu
2fc8d9fe86 chgserver: implement chgui._runpager
This patch implements chgui._runpager in a relatively simple way. A cleaner
way is to move the core logic of "attachio" to "ui", which will be
done later after chg runs uisetup per request.
2017-01-10 06:59:31 +08:00
Jun Wu
ed9bebc440 chgserver: make S channel support pager request
This patch adds the "pager" support for the S channel. The pager API allows
running some subcommands, namely attachio, and waiting for the client to be
properly synchronized.
2017-01-10 06:59:21 +08:00
Jun Wu
7085592213 chgserver: use util.shellenviron
This avoids code duplication.
2017-01-10 06:58:51 +08:00
Jun Wu
c50e85b0d3 util: extract the logic calculating environment variables
The method will be reused in chgserver. Move it out so it can be reused.
2017-01-10 06:58:02 +08:00
Anton Shestakov
8d71b91ef9 hgweb: generate archive links in order
It would be nice for archive links to always be in a certain commonly used
order, such as 'zip', 'bz2', 'gzip'. Repo index page (hgwebdir_mod) already
shows archive links in this order, let's do the same in hgweb_mod.

Sadly, archivespecs is a regular unordered dict, and collections.OrderedDict is
new in 2.7. But requestcontext.archives is a tuple of archive types, so it can
be used as an index to archivespecs.
2017-01-08 00:52:54 +08:00
Anton Shestakov
5dfa3509d4 hgweb: use archivespecs (dict) instead of archives (tuple) for "in" check 2017-01-08 01:24:45 +08:00
Matt Harbison
1e958d800e help: merge the various operator sections of revsets, filesets and templates
Having sections for specific operator types assumes the user already knows what
type of operators are supported.  By having a common heading, the user can
simply lookup help for "(revsets|filesets|templates).operators".
2017-01-08 12:05:10 -05:00
Matt Harbison
28a570f58c help: apply the section headings from revsets to templates
Unlike filesets, there are a few distinct headings that are not shared with
revsets.  But common names are used where possible.
2017-01-08 02:43:01 -05:00
Matt Harbison
0b96ef6f6b help: apply the section headings from revsets to filesets
This has the nice property of visually breaking up the wall of text.  It also
allows specific smaller sections to be called out.  For example,
`hg help filesets.predicates` now prints just the predicate section.  At the
moment, the revset headings are a superset of the fileset headings, so there is
consistency in how example, predicate and operator help is called out.

The reference to `hg help patterns` was moved to the overview section, so that
it isn't stuck in the examples section.
2017-01-08 02:40:36 -05:00
Jun Wu
56484854f9 chgserver: check type passed to S channel
It currently only supports the "system" type. Add an explicit check.
2017-01-06 16:12:25 +00:00
Jun Wu
734e02b02d chg: send type information via S channel (BC)
Previously the S channel was only used to send system commands. It will also be
used to send pager commands. So add a type parameter.

This breaks older chg clients. But chg and hg should always come from a
single commit and be packed into a single package. Supporting running
inconsistent versions of chg and hg seems to be unnecessarily complicated
with little benefit. So just make the change and assume people won't use
inconsistent chg with hg.
2017-01-06 16:11:03 +00:00
Yuya Nishihara
630684236e commit: fix unmodified message detection for the "--- >8 ----" magic
We need the raw editortext to be compared with the templatetext.
2017-01-06 22:50:04 +09:00
Denis Laxalde
6dab59dff8 summary: use ui.label and join to write evolution troubles
Follow-up on da7b2bf5ad52 to avoid a convoluted loop.
2017-01-07 12:24:15 +01:00
Denis Laxalde
ea885ed1d6 log: drop unnecessary ui.note label from "trouble: " line
Follow-up on 38b8a4a2230c and 3f2425cfd46f.
2017-01-07 12:07:56 +01:00
Denis Laxalde
20d1dad252 revset: add a followlines(file, fromline, toline[, rev]) revset
This revset returns the history of a range of lines (fromline, toline) of a
file starting from `rev` or the current working directory.

Added tests in test-annotate.t which already contains a reasonably complex
repository.
2017-01-04 16:47:49 +01:00
Denis Laxalde
7092fa95d8 context: add a blockancestors(fctx, fromline, toline) function
This yields ancestors of `fctx` by only keeping changesets touching the file
within specified linerange = (fromline, toline).

Matching revisions are found by inspecting the result of `mdiff.allblocks()`,
filtered by `mdiff.blocksinrange()`, to find out if there are blocks of type
"!" within specified line range.

If, at some iteration, an ancestor with an empty line range is encountered,
the algorithm stops as it means that the considered block of lines actually
has been introduced in the revision of this iteration. Otherwise, we finally
yield the initial revision of the file as the block originates from it.

When a merge changeset is encountered during ancestors lookup, we consider
there's a diff in the current line range as long as there is a diff between
the merge changeset and at least one of its parents (in the current line
range).
2016-12-28 23:03:37 +01:00
Denis Laxalde
dc8e8fcbf9 mdiff: add a "blocksinrange" function to filter diff blocks by line range
The function filters diff blocks as generated by the mdiff.allblocks function based
on whether they are contained in a given line range based on the "b-side" of
blocks.
2017-01-03 18:15:58 +01:00
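Illustration for the "blocksinrange" and "blockancestors" entries above: a simplified sketch of filtering diff blocks by their "b-side" line range. It assumes each block is a pair ((a1, a2, b1, b2), kind) with kind "=" for unchanged and "!" for changed lines, and it treats any overlap with the requested range as a match. Both function names are hypothetical, and the real mdiff.blocksinrange() may apply different rules and return more information.

```python
def blocks_in_range(blocks, fromline, toline):
    """Yield (block, kind) pairs whose b-side overlaps [fromline, toline)."""
    for (a1, a2, b1, b2), kind in blocks:
        if b1 < toline and b2 > fromline:
            yield (a1, a2, b1, b2), kind


def range_changed(blocks, fromline, toline):
    """Return True if any changed ("!") block touches the line range."""
    return any(
        kind == "!" for _block, kind in blocks_in_range(blocks, fromline, toline)
    )


# Hypothetical diff: lines 0-2 unchanged, one changed hunk, the rest unchanged.
example_blocks = [
    ((0, 3, 0, 3), "="),
    ((3, 4, 3, 5), "!"),
    ((4, 10, 5, 11), "="),
]
```

A walk over file ancestors could then keep only the revisions for which range_changed() is true for the line range being followed.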
Denis Laxalde
5e3ca8d1ab summary: add evolution "troubles" information to summary output
Extend the "parent: " lines in summary with the list of evolution "troubles"
in parentheses, when the parent is troubled.
2017-01-06 14:35:22 +01:00
Denis Laxalde
8ebbb5679b summary: use the same labels as log command in "parent: " line
Re-use the cmdutil._changesetlabels function introduced in c400c86d547f to
have consistent labels between the "changeset: " line in log command and the
"parent: " line in summary.
2017-01-06 14:34:34 +01:00
Denis Laxalde
b2aed04403 templates: display evolution "troubles" in command line style 2017-01-06 13:50:52 +01:00
Denis Laxalde
0c89f1cb3e templatekw: add a "troubles" template keyword
The "troubles" template keyword returns a list of evolution troubles.
It is EXPERIMENTAL, as anything else related to changeset evolution.

Test it in test-obsolete.t which has troubled changesets.
2017-01-06 13:50:16 +01:00
Denis Laxalde
627f47db5c cmdutil: add missing "i18n" comment about "trouble: " line
Follow-up on 38b8a4a2230c per late review.
2017-01-06 12:36:21 +01:00
Gregory Szorc
05ec82c913 hgweb: link to raw-file on annotation page (BC)
Every other template has the "raw" link load "raw-file." However,
fileannotate.tmpl's "raw" link loads "raw-annotate." This feels
inconsistent and wrong.

As far as I can tell, linking to the "raw annotate" view has occurred
since 2006.
2016-12-28 15:48:17 -07:00
Martin von Zweigbergk
e1f0ba8ef9 repair: combine two loops over changelog revisions
This just saves a few lines.
2017-01-04 10:35:04 -08:00
Martin von Zweigbergk
92d0334538 repair: speed up stripping of many roots
repair.strip() expects a set of root revisions to strip. It then
builds the full set of descendants by walking the descendants of
each. It is rare that more than a few roots get passed in, but if that
happens, it will wastefully walk the changelog for each root. So let's
just walk it once.

I noticed this because the narrowhg extension was passing not only
roots, but all the commits to strip. When there were tens of thousands
of commits to strip, this resulted in quadratic behavior with that
extension.
2017-01-04 10:07:12 -08:00
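Illustration for the "repair: speed up stripping of many roots" entry above: a minimal sketch of collecting the descendants of many roots in a single pass instead of one walk per root. The parents list is a hypothetical stand-in for the changelog, under the assumption that every parent revision number is smaller than its child's.

```python
def descendants_single_pass(parents, roots):
    """Return the roots plus all of their descendants.

    parents[rev] is a tuple of parent revision numbers, each smaller than rev.
    """
    found = set(roots)
    if not found:
        return found
    # One forward walk: a revision is a descendant if any parent already is.
    for rev in range(min(found) + 1, len(parents)):
        if any(p in found for p in parents[rev]):
            found.add(rev)
    return found


# Hypothetical history: 0 <- 1 <- {2, 3} <- 4 (merge), and 5 branching off 0.
example_parents = [(), (0,), (1,), (1,), (2, 3), (0,)]
```

The cost is one pass over the revisions after the lowest root, regardless of how many roots are passed in.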
Sean Farley
7f456ac7c6 config: add docs for ignoring all text below in the editor
This is an example of how to use the new skip-from-there string for ignoring the
diff in a commit message.
2017-01-04 22:32:42 -06:00
Sean Farley
52b92c45af cmdutil: add special string that ignores rest of text
Similar to git, we add a special string:

  HG: ------------------------ >8 ------------------------

that means anything below it is ignored in a commit message.

This is helpful for integrating with third-party tools that display the
2016-12-31 15:36:36 -06:00
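Illustration for the "cmdutil: add special string that ignores rest of text" entry above: a minimal sketch of dropping everything from the marker line onward. The marker is copied from the message; the function name and the exact matching rule (a whole line equal to the marker) are assumptions, not the actual cmdutil code.

```python
SCISSORS = "HG: ------------------------ >8 ------------------------"


def strip_below_scissors(text):
    """Return text up to, but not including, the first scissors line."""
    lines = text.splitlines(True)  # keep line endings
    for i, line in enumerate(lines):
        if line.rstrip("\r\n") == SCISSORS:
            return "".join(lines[:i])
    return text


example_message = "fix the bug\n\n" + SCISSORS + "\ndiff --git a/f b/f\n"
```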
Pierre-Yves David
09a5f07adf localrepo: deprecated '_link'
That method had a total of 1 internal user...
2016-08-05 14:15:45 +02:00
Pierre-Yves David
2634c22ce0 localrepo: use self.wvfs.islink directly
We are about to deprecate the helper function.
2016-08-05 14:19:31 +02:00
Pulkit Goyal
fcc99d385f py3: convert opts back to bytes for status 2017-03-16 10:10:00 +05:30
Gregory Szorc
206cd3557e parsers: handle refcounting of "parents" consistently
Py_None can be refcounted like any other Python object. So
do that.
2017-03-13 17:49:13 -07:00
Martin von Zweigbergk
7b5dab5409 py3: make py3 compat.iterbytestr simpler and faster
With Python 3.4.3, timeit says 11.9 usec -> 6.44 usec. With Python
3.6.0, timeit says 14.1 usec -> 9.55 usec.
2017-03-15 09:32:18 -07:00
Martin von Zweigbergk
cf1f0d3920 py3: optimize py3 compat.bytechr using Struct.pack
With Python 3.4.3, timeit says 0.437 usec -> 0.0685 usec. With Python
3.6, timeit says 0.157 usec -> 0.0907 usec. So it's faster on both
versions, but the speedup varies a lot.

Thanks to Gregory Szorc for the suggestion.
2017-03-15 09:30:50 -07:00
Pulkit Goyal
f9b7821158 dirstate: use list comprehension to get a list of keys
We have used dict.keys() which returns a dict_keys() object instead
of list on Python 3. So this patch replaces that with list comprehension
which works both on Python 2 and 3.
2017-03-16 09:00:27 +05:30
Pulkit Goyal
5a7c5d918e match: slice over bytes to get the byteschr instead of ascii value 2017-03-16 08:03:51 +05:30
Pulkit Goyal
0bad6ee6aa match: make regular expression bytes to prevent TypeError 2017-03-16 07:52:47 +05:30
Pulkit Goyal
7ed6d8245e scmutil: make function name bytes in class filecache
func.__name__ returns unicode, and this leads to a KeyError when we try
to do filecache[''] by passing bytes.
2017-03-16 06:32:33 +05:30
Pierre-Yves David
0923ede90f localrepo: deprecate 'wfile'
The method had very few users and the modern form is shorter. So let us
deprecate another method of the localrepo class.
2017-03-15 00:27:17 -07:00
Pierre-Yves David
5fc4dd0997 localrepo: use 'wvfs' instead of 'wfile'
Method is about to be deprecated and the modern form is shorter.
2017-03-15 00:29:09 -07:00
Pierre-Yves David
3298e01804 tagmerge: use 'wvfs' instead of 'wfile'
Method is about to be deprecated and the modern form is shorter.
2017-03-15 00:28:58 -07:00
Pierre-Yves David
2a46795ed1 localrepo: don't use mutable default argument value
Caught by pylint.
2017-03-14 23:50:07 -07:00
Pierre-Yves David
c4d32111df httpclient: don't use mutable default argument value
Caught by pylint.
2017-03-14 23:49:25 -07:00
Gregory Szorc
1c399d6b97 util: make strdate's defaults default value a dict
It was specified to be an empty list in 3d8abfdaa08a in 2007.
It was correct at the time. But when the function was
refactored in 7bca0f2718ab (2010), it started expecting a dict.
I guess this code path is untested?

Thanks to Yuya for spotting this.
2017-03-14 08:51:35 -07:00
Rishabh Madan
6a6d5ec05c py3: open file in rb mode 2017-03-15 14:51:18 +05:30
Kyle Lippincott
a0eab21ffc debuglabelcomplete: fix to call debugnamecomplete in new location
debugnamecomplete was moved in a9aa67ba from commands to debugcommands, but
debuglabelcomplete was not modified to call it in its new location.
2017-03-14 13:10:30 -07:00
Gregory Szorc
75a74f883f pycompat: custom implementation of urllib.parse.quote()
urllib.parse.quote() accepts either str or bytes and returns str.

There exists a urllib.parse.quote_from_bytes() which only accepts
bytes. We should probably use that to retain strong typing and
avoid surprises.

In addition, since nearly all strings in Mercurial are bytes, we
probably don't want quote() returning unicode.

So, this patch implements a custom quote() that only accepts bytes
and returns bytes. The quoted URL should only contain URL safe
characters, which are a strict subset of ASCII. So
`.encode('ascii', 'strict')` should be safe.
2017-03-13 12:16:47 -07:00
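Illustration for the "pycompat: custom implementation of urllib.parse.quote()" entry above: a minimal Python 3 sketch of a bytes-in, bytes-out quote built on urllib.parse.quote_from_bytes(). The name quote_bytes and its signature are hypothetical; the actual pycompat helper may differ.

```python
import urllib.parse


def quote_bytes(s, safe=b"/"):
    """Percent-encode bytes and return bytes (Python 3 only)."""
    if not isinstance(s, bytes):
        raise TypeError("quote_bytes() expects bytes, got %r" % type(s))
    # quote_from_bytes() accepts only bytes input and returns str.
    quoted = urllib.parse.quote_from_bytes(s, safe=safe)
    # The quoted form contains only URL-safe ASCII, so strict encoding
    # is not expected to fail.
    return quoted.encode("ascii", "strict")
```

Rejecting str up front keeps the typing strict, which matches the motivation given in the message.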
Gregory Szorc
db258a9802 pycompat: alias urllib symbols directly
urllib.request imports a bunch of symbols from other urllib
modules. We should map to the original symbols not the
re-exported ones because this is more correct. Also, it
will prevent an import of urllib.request if only one of
the lower-level symbols/modules is needed.
2017-03-13 12:14:17 -07:00
Gregory Szorc
dc53c9ca1b formatter: support json formatting of long type
By luck, we appear to not pass any long instances into
the JSON formatter. I suspect this will change with all the
Python 3 porting work. Plus I have another series that will
convert some ints to longs that triggers this.
2017-03-13 18:31:29 -07:00
Gregory Szorc
fbefc30c75 util: don't use mutable default argument value
I don't think this is in any tight loops where we'd need to worry about
PyObject creation overhead. Also, I'm pretty sure strptime()
will be much slower than PyObject creation (date parsing is
surprisingly slow).
2017-03-12 21:54:32 -07:00
Gregory Szorc
98c99b99fa match: don't use mutable default argument value
There shouldn't be a big perf hit creating a new object because
this function is complicated and does things that dwarf the cost
of creating a new PyObject.
2017-03-12 21:53:03 -07:00
Gregory Szorc
d2e9e46760 hgweb: don't use mutable default argument value 2017-03-12 21:52:17 -07:00
Gregory Szorc
5cc9a634fe hgweb: don't use mutable default argument value 2016-12-26 16:55:47 -07:00
Gregory Szorc
9bc6055676 filemerge: don't use mutable default argument value 2016-12-26 16:54:33 -07:00
Gregory Szorc
b1f8fb51cd context: don't use mutable default argument value
Mutable default argument values are a Python gotcha and can
represent subtle, hard-to-find bugs. Let's rid our code base
of them.
2017-03-12 21:50:42 -07:00
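Illustration for the "don't use mutable default argument value" entries above: a minimal sketch of the gotcha and the usual fix. Both functions are hypothetical examples and are not taken from the Mercurial code base.

```python
def append_shared(item, bucket=[]):
    """Buggy: the default list is created once and shared by every call."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Fixed: use None as a sentinel and build a new list per call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling append_shared() twice without a bucket returns the same list object both times, so the second result also contains the first item.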
Martin von Zweigbergk
b7d134a4b3 heads: enable pager 2017-03-13 11:19:24 -07:00
Martin von Zweigbergk
d6f2ec5990 branches: enable pager 2017-03-13 11:03:59 -07:00
Yuya Nishihara
0b54547397 py3: fix slicing of bytes in revset.formatspec() 2017-03-12 17:16:43 -07:00
Yuya Nishihara
ce52228976 py3: make set of revset operators and quotes in bytes 2017-03-12 17:13:54 -07:00
Yuya Nishihara
b72ea4927a py3: convert set of revset initial symbols back to bytes
Otherwise tokenize() would fail due to comparison between unicode and bytes.
2017-03-12 17:10:14 -07:00
Yuya Nishihara
a1b53adeff pycompat: add helper to iterate each char in bytes 2017-03-12 17:04:45 -07:00
Augie Fackler
38e6574e36 branchmap: fix python 2.6 by using util.buffer() instead of passing bytearray 2017-03-12 19:47:51 -04:00
Mads Kiilerich
ea3cfbc6fb merge: check current wc branch for 'nothing to merge', not its p1
The working directory will usually be clean or very clean, and wc will usually
have the same branch as its parent. This change will thus usually not make any
difference and is done as a separate change to show that. It will be used in a
later change.
2017-03-12 16:41:46 -07:00
Yuya Nishihara
1aa2b14401 lock: do not encode result of gethostname on Python 2
If a hostname contained non-ascii character, str.encode() would first try
to decode it to a unicode and raise UnicodeDecodeError.
2017-03-12 16:26:34 -07:00
Augie Fackler
3c69b88530 lock: encode result of gethostname into a bytestring 2017-03-12 03:28:50 -04:00
Martijn Pieters
458cf59837 config: avoid using a mutable default
Nothing *currently* mutates this list, but the moment something does it'll be
shared between all config instances. Avoid this eventuality.
2017-03-12 12:56:12 -07:00
Pierre-Yves David
c6f71ed49c localrepo: deprecate 'repo.join' in favor of 'repo.vfs.join'
localrepo has an insane number of methods. Accessing the feature through the
vfs is not really harder and allows us to schedule that method for removal.
2016-08-05 14:09:04 +02:00
Yuya Nishihara
7daa87b335 pycompat: move imports of cStringIO/io to where they are used
There's no point to import cStringIO as io since we have to select StringIO
or BytesIO conditionally.
2017-03-12 12:54:11 -07:00
Mads Kiilerich
d97e14e32b rbc: empty (and invalid) rbc-names file should give an empty name list
An empty file (if it somehow should exist) used to give a list with an empty
name. That didn't do any harm, but it was "wrong". Fix that.
2017-03-12 12:17:30 -07:00
Mads Kiilerich
d6292de3bd rbc: use struct unpack_from and pack_into instead of unpack and pack
These functions were introduced in Python 2.5 and are faster and simpler than
the old ones ...  mainly because we can avoid intermediate buffers:

  $ python -m timeit -s "_rbcrecfmt='>4sI'" -s 's = "x"*10000' -s 'from struct import unpack' 'unpack(_rbcrecfmt, buffer(s, 16, 8))'
  1000000 loops, best of 3: 0.543 usec per loop
  $ python -m timeit -s "_rbcrecfmt='>4sI'" -s 's = "x"*10000' -s 'from struct import unpack_from' 'unpack_from(_rbcrecfmt, s, 16)'
  1000000 loops, best of 3: 0.323 usec per loop

  $ python -m timeit -s "from array import array" -s "_rbcrecfmt='>4sI'" -s "s = array('c')" -s 's.fromstring("x"*10000)' -s 'from struct import pack' -s "rec = array('c')" 'rec.fromstring(pack(_rbcrecfmt, "asdf", 7))'
  1000000 loops, best of 3: 0.364 usec per loop
  $ python -m timeit -s "from array import array" -s "_rbcrecfmt='>4sI'" -s "s = array('c')" -s 's.fromstring("x"*10000)' -s 'from struct import pack_into' -s "rec = array('c')" -s 'rec.fromstring("x"*100)' 'pack_into(_rbcrecfmt, rec, 0, "asdf", 7)'
  1000000 loops, best of 3: 0.229 usec per loop
2016-10-19 02:46:35 +02:00
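Illustration for the "rbc: use struct unpack_from and pack_into" entry above: a minimal sketch of reading and writing a fixed-size record in place, without slicing out an intermediate buffer. The format string is the one from the timeit runs; the buffer size and record position are made up for the example.

```python
import struct

RECORD_FMT = ">4sI"                        # 4 raw bytes + unsigned 32-bit int
RECORD_SIZE = struct.calcsize(RECORD_FMT)  # 8 bytes per record

# A writable buffer holding four zeroed records.
buf = bytearray(RECORD_SIZE * 4)

# Write record number 2 directly into the buffer.
offset = 2 * RECORD_SIZE
struct.pack_into(RECORD_FMT, buf, offset, b"asdf", 7)

# Read it back from the same offset, again without copying a slice.
node, branchidx = struct.unpack_from(RECORD_FMT, buf, offset)
```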
Augie Fackler
bc09440907 revlog: use bytes() instead of str() to get data from memoryview
Fixes `files -v` on Python 3.
2017-03-12 15:27:02 -04:00
Augie Fackler
d5d09dfdf4 util: teach url object about __bytes__
__str__ tries to do something reasonable, but someone else more
familiar with encoding bugs should check my work.
2017-03-12 03:33:22 -04:00
Augie Fackler
408bc8a668 manifest: ensure paths are bytes (not str) in pure parser 2017-03-12 03:31:54 -04:00
Augie Fackler
518fdf5357 manifest: now that node.bin is available, use it directly
Previously we were getting it through revlog, which is a little unusual.
2017-03-12 03:30:15 -04:00
Augie Fackler
c6a7c91d01 manifest: use node.bin instead of .decode('hex')
The latter doesn't work in Python 3.
2017-03-12 03:29:48 -04:00
Augie Fackler
93c6a91a94 manifest: add __next__ methods for Python 3
Python 3 renamed .next() in the iterator protocol to __next__().
2017-03-12 00:43:20 -05:00
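Illustration for the "manifest: add __next__ methods for Python 3" entry above: a minimal sketch of an iterator class that works under both spellings of the protocol. The class is hypothetical and much simpler than the manifest iterators.

```python
class lineiter(object):
    """Iterate over the lines of a bytes blob, keeping line endings."""

    def __init__(self, data):
        self._lines = data.splitlines(True)
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._lines):
            raise StopIteration
        line = self._lines[self._pos]
        self._pos += 1
        return line

    next = __next__  # Python 2 looks for .next() instead
```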
Augie Fackler
f010a9d8ef files: use native string type to load rev opt from dict 2017-03-12 00:51:00 -05:00
Augie Fackler
8c2dacbd98 store: fix many single-byte ops to use slicing in _auxencode 2017-03-12 00:50:44 -05:00
FUJIWARA Katsunori
9d450170ba py3: add "b" prefix to string literals related to module policy
String literals without explicit prefix in __init__.py and policy.py
are treated as unicode object on Python3, because these modules are
loaded before setup of our specific code transformation (the latter
module is imported at the beginning of __init__.py).

BTW, "modulepolicy" in __init__.py is initialized by "policy.policy".

This causes the issues below:

  - checking "policy" value in other modules causes unintentional result

    For example, "b'py' not in (u'c', u'py')" returns True
    unintentionally on Python3.

  - writing "policy" out fails at conversion from unicode to bytes

    db1ebf457295 fixed this issue for default code path, but "policy"
    can be overridden by the HGMODULEPOLICY environment variable (this
    should be a rare case for developers using Python 3, though).

This patch does:

  - add "b" prefix to all string literals, which are related to module
    policy, in modules above.

  - check existence of HGMODULEPOLICY, and overwrite "policy" only if
    it exists

    For simplicity, this patch omits checking "supports_bytes_environ",
    switching os.environ/os.environb, and so on (Yuya agreed to this in a
    personal conversation)
2017-03-13 04:06:36 +09:00
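Illustration for the 'py3: add "b" prefix to string literals related to module policy' entry above: a minimal sketch of the membership check that goes wrong on Python 3 when bytes are compared against unprefixed (unicode) literals. The variable names are made up for the example.

```python
policy = b"py"

unicode_choices = (u"c", u"py")  # what unprefixed literals become on Python 3
bytes_choices = (b"c", b"py")    # literals with an explicit b prefix

# On Python 3, bytes never compare equal to str, so the first test is True
# even though "py" appears to be listed.
missing_from_unicode = policy not in unicode_choices
found_in_bytes = policy in bytes_choices
```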
Yuya Nishihara
bec7ade60c py3: drop unused aliases to array.array which are replaced with bytearray 2017-03-12 11:47:02 -07:00
Pulkit Goyal
a0c31269e8 pycompat: default to BytesIO instead of StringIO 2017-03-13 00:55:14 +05:30
Augie Fackler
87aee9cbf5 repoview: specify setattr values as native strings 2017-03-12 00:48:06 -05:00
Augie Fackler
03a50eb15f revlog: use bytes() to ensure text from _chunks is a reasonable type 2017-03-12 03:32:38 -04:00
Augie Fackler
58dedd9fd0 revlog: extract first byte of revlog with a slice so it's portable 2017-03-12 00:49:49 -05:00
Augie Fackler
9d7c26df45 revsetlang: slice out single bytes instead of indexing
For portability with Python 3.
2017-03-12 00:46:59 -05:00
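Illustration for the "slice out single bytes instead of indexing" entries above: a minimal sketch of the difference on Python 3, where indexing a bytes object yields an int while a one-byte slice yields bytes. The sample value is arbitrary.

```python
data = b"revset"

indexed = data[0]   # an int on Python 3 (a 1-char str on Python 2)
sliced = data[0:1]  # a length-1 bytes object on Python 2 and 3

# Comparisons against a bytes literal therefore only work portably
# with the sliced form.
starts_with_r = sliced == b"r"
```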
Augie Fackler
b4f8ffef60 lock: use %d to format integer into a bytestring 2017-03-12 03:29:04 -04:00
Augie Fackler
ef815f4375 parser: use %d instead of %s for interpolating error position
Error position is an int, so we should use %d instead of %s. Fixes
failures on Python 3.
2017-03-12 00:44:59 -05:00
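Illustration for the "use %d instead of %s" entries above: a minimal sketch of why %s fails for an int once the format string is bytes. It assumes Python 3.5 or later, where bytes support %-formatting, and that the format string ends up as bytes (as the message says of Mercurial's source loader). The message text is made up.

```python
pos = 12

# %d accepts an int when formatting into bytes.
message = b"parse error at %d" % pos

# %s on bytes requires a bytes-like object, so an int raises TypeError.
try:
    b"parse error at %s" % pos
    percent_s_failed = False
except TypeError:
    percent_s_failed = True
```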
Augie Fackler
949dee72f1 manifest: unbreak pure-python manifest parsing on Python 3 2017-03-12 00:44:21 -05:00
Augie Fackler
d214daa434 context: use portable construction to verify int parsing 2017-03-12 00:43:47 -05:00
Augie Fackler
5e07b24e52 ui: portably bytestring-ify url object 2017-03-12 01:59:23 -05:00
Augie Fackler
edad90c687 scmutil: fix key generation to portably bytestringify integer 2017-03-12 00:47:39 -05:00
Augie Fackler
9c70a09b17 branchmap: stringify int in a portable way
We actually need a bytes in Python 3, and thanks to our nasty source
loader this will portably do the right thing.
2017-03-12 00:42:46 -05:00
Augie Fackler
0c31289213 branchmap: don't use buffer() on Python 3
This is certainly slower than the Python 2 code, but it works, and we
can revisit it later if it's a problem.
2017-03-12 00:49:19 -05:00
Augie Fackler
9a15a28705 py3: use bytearray() instead of array('c', ...) constructions
Portable from 2.6-3.6.
2017-03-12 03:32:21 -04:00
Augie Fackler
b9f0d10d43 summary: don't explicitly str() something we're about to %s
str() is wrong on Python 3 here, and %s implicitly calls str() anyway,
so this was just extra dancing for no reason.
2017-03-11 20:58:26 -05:00
Augie Fackler
d44c41fe19 context: implement both __bytes__ and __str__ for Python 3
They're very similar, for obvious reasons.
2017-03-11 20:57:40 -05:00
Augie Fackler
89600a72c4 context: work around long not existing on Python 3
I can't figure out what this branch is even trying to accomplish, and
it was introduced in 387a3aa50d61 which doesn't really shed any
insight into why longs are treated differently from ints.
2017-03-11 20:57:04 -05:00
Augie Fackler
f080be2c20 phases: explicitly evaluate list returned by map
On Python 3 map() returns an iterator, which bool()s to true even if
it had an empty input set. Work around this by using list() on the
map() result.
2017-03-11 20:53:20 -05:00
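Illustration for the "phases: explicitly evaluate list returned by map" entry above: a minimal Python 3 sketch of the truthiness difference between a lazy map object and a list. The input is an arbitrary empty list.

```python
roots = []

lazy = map(int, roots)          # a map object on Python 3
eager = list(map(int, roots))   # an actual (empty) list

lazy_is_truthy = bool(lazy)     # True: the object itself is always truthy
eager_is_truthy = bool(eager)   # False: an empty list is falsy
```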
Augie Fackler
6ba88e41e4 ui: check for --debugger in sys.argv using r-string to avoid bytes on py3
Our source loader was errantly turning this --debugger into a bytes,
which was then causing me to still get a pager when I was using the
debugger on py3. That made life hard.
2017-03-11 20:51:09 -05:00
Pulkit Goyal
f34f53b9de minirst: use bytes.strip instead of str.strip
bytes.strip exists in Python 2.6 and Python 2.7 also.
2017-03-12 22:46:57 +05:30
Pulkit Goyal
48edb15e9c scmposix: pass unicode as first argument to array.array
This is one instance where we can safely convert the first argument. The
remaining cases, except one, use 'c', which does not exist in Python 3, so
they need to be handled differently. This will help in making `hg help` run on
Python 3.
2017-03-12 22:27:53 +05:30
Pulkit Goyal
7deacd3d03 util: pass encoding.[encoding|encodingmode] as unicodes
We need to pass str to encode() and decode().
2017-03-12 07:35:13 +05:30
Pierre-Yves David
5e62e32b2e subrepo: directly use repo.vfs.join
The 'repo.join' method is about to be deprecated.
2017-03-08 16:53:47 -08:00
Pierre-Yves David
197ab7aeb0 repair: directly use repo.vfs.join
The 'repo.join' method is about to be deprecated.
2017-03-08 16:53:39 -08:00
Pierre-Yves David
98f81e8c4f merge: directly use repo.vfs.join
The 'repo.join' method is about to be deprecated.
2017-03-08 16:53:32 -08:00
Pierre-Yves David
de20776881 hg-mod: directly use repo.vfs.join
The 'repo.join' method is about to be deprecated.
2017-03-08 16:53:24 -08:00
Pierre-Yves David
d47e9585d6 commands: directly use repo.vfs.join
The 'repo.join' method is about to be deprecated.
2017-03-08 16:53:17 -08:00
Pierre-Yves David
80b1f7c309 cmdutil: directly use repo.vfs.join
The 'repo.join' method is about to be deprecated.
2017-03-08 16:53:09 -08:00
Pierre-Yves David
b71c55108c localrepo: directly use repo.vfs.join
The 'repo.join' method is about to be deprecated.
2016-08-05 14:29:22 +02:00
Pulkit Goyal
077cba9952 minirst: make encoding.encoding unicodes to pass into encode() and decode() 2017-03-12 07:09:18 +05:30
Pulkit Goyal
bd7d2c3f64 minirst: make regular expressions bytes 2017-03-12 06:59:37 +05:30
Yuya Nishihara
a7a60a2e43 revset: drop TODO comment about sorting issue of fullreposet
The bootstrapping issue was addressed at the parsing phase and we expect
that fullreposet.__and__() fully complies to the smartset API, in which
'self & other' should return a result set in self's order. See also
ab938e7ae803.
2016-05-14 20:52:44 +09:00
Yuya Nishihara
2fa6a1e65e revset: document wdir() as an experimental function
Let's resurrect the docstring since our help module can detect the EXPERIMENTAL
tag and display it only if -v is specified.

This patch updates the test added by bbdfa2d5aaa2 since wdir() is now
documented.
2017-01-05 22:53:42 +09:00
Yuya Nishihara
ec99971228 revset: categorize wdir() as very fast function
The cost of wdir() should be identical to or cheaper than _intlist().
2016-08-20 17:50:23 +09:00
Yuya Nishihara
14fa3ba925 revset: make children() not look at p2 if null (issue5439)
Unlike p1 = null, p2 = null denotes the revision has only one parent, which
shouldn't be considered a child of the null revision. This was spotted while
fixing the issue4682 and rediscovered as issue5439.
2015-05-23 11:04:11 +09:00
Augie Fackler
067ebafd12 merge with stable 2017-01-04 14:52:59 -05:00
Denis Laxalde
d53254ecde templates-default: factor out definition of changeset labels
This is redundant for normal and debug mode and prepares extension of this
list that should affect both modes.
2017-01-03 13:25:29 +01:00
Yuya Nishihara
c175ab72eb posix: make poll() restart on interruption by signal (issue5452)
select() is a notable example of syscalls which may fail with EINTR. If we
had a SIGWINCH handler installed, ssh would crash when the terminal window
was resized. This patch fixes the problem.
2016-12-22 23:14:13 +09:00
Yuya Nishihara
3379250232 demandimport: do not raise ImportError for unknown item in fromlist
This is the behavior of the default __import__() function, which doesn't
validate the existence of the fromlist items. Later on, the missing attribute
is detected while processing the import statement.

https://hg.python.org/cpython/file/v2.7.13/Python/import.c#l2575

The comtypes library relies on this (maybe) undocumented behavior, and we
got a bug report to TortoiseHg, sigh.

https://bitbucket.org/tortoisehg/thg/issues/4647/

The test added at 0be19b069edf verifies the behavior of the import statement,
so this patch only adds the test of __import__() function and works around
CPython/PyPy difference.
2016-12-19 22:46:00 +09:00
Anton Shestakov
dc9f869036 hgweb: add missing slash to file log url in rss style 2016-12-08 23:59:36 +08:00
FUJIWARA Katsunori
367ebf8ba3 scmutil: ignore EPERM at os.utime, which avoids ambiguity at closing
According to POSIX specification, just having group write access to a
file causes EPERM at invocation of os.utime() with an explicit time
information (e.g. working on the repository shared by group access
permission).

To ignore EPERM at closing file object in such case, this patch makes
checkambigatclosing._checkambig() use filestat.avoidambig() introduced
by previous patch.

Some functions below imply this code path at truncation of an existing
(= might be owned by another user) file.

  - strip() in repair.py, introduced by 4d0a08431b6f
  - _playback() in transaction.py, introduced by 48fe04792102

This is a variant of issue5418.
2016-11-13 06:12:22 +09:00
FUJIWARA Katsunori
11742ce806 vfs: ignore EPERM at os.utime, which avoids ambiguity at renaming (issue5418)
According to POSIX specification, just having group write access to a
file causes EPERM at invocation of os.utime() with an explicit time
information (e.g. working on the repository shared by group access
permission).

To ignore EPERM at renaming in such case, this patch makes
vfs.rename() use filestat.avoidambig() introduced by previous patch.
2016-11-13 06:11:56 +09:00
FUJIWARA Katsunori
64644e300c util: add utility function to skip avoiding file stat ambiguity if EPERM
Now, advancing stat.st_mtime by os.utime() is used to avoid file stat
ambiguity. But according to POSIX specification, utime(2) with an
explicit time information is permitted only for a process with:

  - the effective user ID equal to the user ID of the file, or
  - appropriate privileges

  http://pubs.opengroup.org/onlinepubs/9699919799/functions/utime.html

Therefore, just having group write access to a file causes EPERM at
applying os.utime() on it (e.g. working on the repository shared by
group access permission).

This patch adds the avoidambig() utility function to class filestat; it
avoids file stat ambiguity but skips doing so on EPERM.

It is reasonable to always ignore EPERM, because utime(2) causes EPERM
only in the case described above (EACCES is used only for utime(2)
with NULL).
2016-11-13 06:06:23 +09:00
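The idea in 64644e300c can be sketched roughly as follows. This is not the actual filestat.avoidambig() code: the standalone function, its signature and its return value are assumptions made for illustration.

```python
import errno
import os


def avoidambig(path, oldmtime):
    """Advance the mtime of path by one second (illustrative sketch).

    Returns False when os.utime() fails with EPERM, True otherwise.
    """
    advanced = (oldmtime + 1) & 0x7fffffff
    try:
        os.utime(path, (advanced, advanced))
    except OSError as inst:
        if inst.errno == errno.EPERM:
            # utime(2) with explicit times needs file ownership or
            # appropriate privileges; group write access is not enough.
            return False
        raise
    return True
```

A caller can treat a False result as "ambiguity could not be avoided" and carry on instead of aborting.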
Gregory Szorc
085fa86140 hgweb: cache fctx.parents() in annotate command (issue5414)
43e3fb1c484e introduced a call to fctx.parents() for each line in
annotate output. This function call isn't cheap, as it requires
linkrev adjustment.

Since multiple lines in annotate output tend to belong to the same
file revision, a cache of fctx.parents() lookups for each input
should be effective in the common case. So we implement one.

Since the cache has to precompute parents so an aborted generator
doesn't leave an incomplete cache, we could just return a list.
However, we preserve the generator for backwards compatibility.

The effect of this change when requesting /annotate/96ca0ecdcfa/
browser/locales/en-US/chrome/browser/downloads/downloads.dtd on
the mozilla-aurora repo is significant:

p1(43e3fb1c484e)  5.5s
43e3fb1c484e:    66.3s
this patch:      10.8s

We're still slower than before. But only by ~2x instead of ~12x.

On the tip revisions of layout/base/nsCSSFrameConstructor.cpp file in
the mozilla-unified repo, time went from 12.5s to 14.5s and back to
12.5s. I'm not sure why the mozilla-aurora repo is so slow.

Looking at the code of basefilectx.parents(), there is room for
further improvements. Notably, we still perform redundant calls to
filelog.renamed() and basefilectx._parentfilectx(). And
basefilectx.annotate() also makes similar calls, so there is potential
for object reuse. However, introducing caches here is not appropriate
for the stable branch.
2016-11-05 09:38:07 -07:00
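The caching approach from 085fa86140 might look like the sketch below. It does not reproduce the hgweb code: `annotate_parents`, `lookup_parents` and the `(rev, text)` line tuples are hypothetical stand-ins for the annotate output and fctx.parents().

```python
def annotate_parents(lines, lookup_parents):
    """Return a generator of (text, parents) pairs.

    lines is a list of (rev, text) tuples; lookup_parents(rev) is the
    expensive call, made at most once per distinct rev.
    """
    cache = {}
    # Fill the cache eagerly, before any item is handed out, so that a
    # consumer abandoning the generator never sees a half-built cache.
    for rev, _text in lines:
        if rev not in cache:
            cache[rev] = lookup_parents(rev)

    def gen():
        for rev, text in lines:
            yield text, cache[rev]

    # Keep returning a generator so existing callers are unaffected.
    return gen()
```

Since many annotate lines share a file revision, the number of expensive lookups drops from one per line to one per distinct revision.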
Nathan Goldbaum
cd41ee4190 tag: clarify warning about making a tag on a branch head
Currently the warning is ambiguous about whether the new tag (possibly specified
via --rev) is being added on a branch head or whether the working directory is
based on a branch head. Clarify the error message to eliminate this ambiguity.
2016-10-31 17:12:32 -05:00
FUJIWARA Katsunori
38ad72f729 help: replace selenic.com by mercurial-scm.org in man pages
Source code repository and mailing list services have been already
migrated to mercurial-scm.org domain.
2016-11-01 20:39:36 +09:00
FUJIWARA Katsunori
15640c5749 help: replace selenic.com by mercurial-scm.org in command examples
Source code repository service of Mercurial itself has been already
migrated to mercurial-scm.org domain.
2016-11-01 20:39:35 +09:00
Mads Kiilerich
40ab99f130 httppeer: make __del__ access to self.urlopener more safe
Some errors could in some cases show unfortunate scary and confusing warnings
from the httppeer delstructors:

  abort: nodename nor servname provided, or not known
  Exception AttributeError: "'httpspeer' object has no attribute 'urlopener'" in <bound method httpspeer.__del__ of <mercurial.httppeer.httpspeer object at 0x106e1f5d0>> ignored

To mute that, take 8bdb0bb8e209 to the next level and use getattr in __del__.
2016-10-31 13:43:48 +01:00
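A minimal sketch of the getattr-in-__del__ pattern from 40ab99f130. The `Peer` and `Opener` classes are hypothetical stand-ins, not the httppeer code.

```python
class Opener(object):
    """Hypothetical resource holder."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class Peer(object):
    def __init__(self, url):
        if not url:
            # Fails before self.urlopener is assigned.
            raise ValueError('no url given')
        self.urlopener = Opener()

    def __del__(self):
        # getattr with a default tolerates a partially initialised
        # object, so no AttributeError is reported during teardown.
        urlopener = getattr(self, 'urlopener', None)
        if urlopener is not None:
            urlopener.close()
```

With a plain `self.urlopener.close()` in `__del__`, a failed `__init__` would produce the "Exception AttributeError ... ignored" noise quoted in the commit message.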
Yuya Nishihara
01ff276025 templater: use unfiltered changelog to calculate shortest() at constant time
cl._partialmatch() can be pretty slow if hidden revisions are involved. This
patch cancels the slowdown introduced by the previous patch by using an
unfiltered changelog, which means shortest(node) isn't always the shortest.

The result isn't perfect, but seems okay as long as shortest(node) is short
enough to type and can be used as an identifier.

  (with hidden revisions)
  % hg log -R hg-committed -r0:20000 -T '{node|shortest}\n' --time > /dev/null
  (.^^) time: real 1.530 secs (user 1.480+0.000 sys 0.040+0.000)
  (.^)  time: real 43.080 secs (user 43.060+0.000 sys 0.030+0.000)
  (.)   time: real 1.680 secs (user 1.650+0.000 sys 0.020+0.000)
2016-10-25 21:49:30 +09:00
Yuya Nishihara
35fcce9afc templater: do not use index.partialmatch() directly to calculate shortest()
cl.index.partialmatch() isn't a drop-in replacement for cl._partialmatch().
It has no knowledge about hidden revisions, and it raises ValueError if a node
shorter than 4 chars is given. Instead, use index.partialmatch() through
cl._partialmatch(), which has no such problems and gives the identical result
with/without --pure.

The test output was sampled with --pure without this patch, which shows the
most correct result. However, we'll need to switch to using an unfiltered
changelog because _partialmatch() of a filtered changelog can be an order of
magnitude slower.

  (with hidden revisions)
  % hg log -R hg-committed -r0:20000 -T '{node|shortest}\n' --time > /dev/null
  (.^)  time: real 1.530 secs (user 1.480+0.000 sys 0.040+0.000)
  (.)   time: real 43.080 secs (user 43.060+0.000 sys 0.030+0.000)
2016-10-23 14:05:23 +09:00
Gábor Stefanik
5533b05a12 merge: avoid superfluous filemerges when grafting through renames (issue5407)
This is a fix for a regression introduced by the patches for issue4028.

The test changes are due to us doing fewer _checkcopies searches now, which
makes some test outputs revert to the pre-issue4028 behavior. That issue itself
remains fixed, we only skip copy tracing for files where it isn't relevant.
As a nice side effect, this makes copy detection much faster when tracing
backwards through lots of renames.
2016-10-25 21:01:53 +02:00
Gábor Stefanik
e9b2eb13b5 sslutil: guard against broken certifi installations (issue5406)
Certifi is currently incompatible with py2exe; the Python code for certifi gets
included in library.zip, but not the cacert.pem file - and even if it were
included, SSLContext can't load a cacert.pem file from library.zip.
This currently makes it impossible to build a standalone Windows version of
Mercurial.

Guard against this, and possibly other situations where a module with the name
"certifi" exists, but is not usable.
2016-10-19 18:06:14 +02:00
Mads Kiilerich
b4b748a9ed revset: don't cache abstractsmartset min/max invocations infinitely
There was a "leak", apparently introduced in b37a67b41690. When running:

    hg = hglib.open('repo')
    while True:
        hg.log("max(branch('default'))")

all filteredset instances from branch() would be cached indefinitely by the
@util.cachefunc annotation on the max() implementation.

util.cachefunc seems dangerous as method decorator and is barely used elsewhere
in the code base. Instead, just open code caching by having the min/max
methods replace themselves with a plain lambda returning the result.
2016-10-25 18:56:27 +02:00
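The open-coded caching described in b4b748a9ed can be sketched like this. `SmartSet` is a hypothetical stand-in for abstractsmartset, and the `computed` counter exists only to make the effect visible.

```python
class SmartSet(object):
    def __init__(self, revs):
        self._revs = list(revs)
        self.computed = 0  # counts real computations, for illustration

    def max(self):
        self.computed += 1
        result = max(self._revs)
        # Shadow the method on this instance with a plain lambda. The
        # cached value lives in the instance dict and is released with
        # the instance, unlike a cache kept by a function decorator.
        self.max = lambda: result
        return result
```

Calling `max()` a second time on the same instance hits the lambda, while a new instance starts with no cached state.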
Mads Kiilerich
a2ea53b6ed dirstate: fix debug.dirstate.delaywrite to use the new "now" after sleeping
It seems like a regression has sneaked into debug.dirstate.delaywrite in
14bddc099338. It would sleep until no files were modified "now" any more, but
when writing the dirstate it would use the old "now" and still mark files as
'unset' instead of recording the timestamp that would make the file show up as
clean instead of unknown.

Instead of getting a new "now" from the file system, we trust the computed end
time as the new "now" and thus cause the actual modification time to be
written to the dirstate.

debug.dirstate.delaywrite is undocumented and only used in
test-largefiles-update.t. All tests seem to work fine for me without
debug.dirstate.delaywrite. Perhaps because it never really worked as intended
without the fix in this patch, and code and tests thus have evolved to do fine
without it? It could thus perhaps make sense to drop usage of this setting in
the tests. That could speed the test up a bit.

This functionality (or something very similar) can however apparently be very
convenient in setups where checking dirty-ness is expensive, such as when
using large files on slow filesystems or when CPU constrained. Now it
works and we can try it. (But ideally, for the largefile use case, it should
probably only delay lfdirstate writes - not ordinary dirstate.)
2016-10-18 16:52:35 +02:00
Gregory Szorc
3f32afbd84 commands: print security protocol support in debuginstall
Over the past week I've had to instruct multiple people to run
Python code to query the ssl module to see what TLS protocol support
is present. I think it would be useful for `hg debuginstall` to print
this info to make it easier to access and debug why Mercurial is
complaining about using an insecure TLS 1.0 protocol.

Ideally we'd also print the path to the CA cert bundle. But the APIs
for querying that in sslutil can emit warnings, making it slightly
more difficult to integrate into `hg debuginstall`. That work will
have to wait for another day.
2016-10-19 15:07:11 -07:00
Durham Goode
9fcac302ea manifest: make treemanifestctx store the repo
Same as in the last commit, the old treemanifestctx stored a reference to the
revlog.  If the inmemory revlog became invalid, the ctx now held an old copy and
would be incorrect. To fix this, we need the ctx to go through the manifestlog
for each access.

This is the same pattern that changectx already uses (it stores the repo, and
accesses commit data through self._repo.changelog).
2016-10-18 17:44:42 -07:00
Durham Goode
46fbc1bfc1 manifest: make manifestctx store the repo
The old manifestctx stored a reference to the revlog. If the inmemory revlog
became invalid, the ctx now held an old copy and would be incorrect. To fix
this, we need the ctx to go through the manifestlog for each access.

This is the same pattern that changectx already uses (it stores the repo, and
accesses commit data through self._repo.changelog).
2016-10-18 17:44:26 -07:00
Durham Goode
871d515e3d manifest: make manifestlog a storecache
The old @property on manifestlog was broken. It meant that we would always
recreate the manifestlog instance, which meant the cache was never hit. Since
we'll eventually remove repo.manifest and make manifestlog the only property,
let's go ahead and make manifestlog the @storecache property, have manifestlog
own the manifest instance, and have repo.manifest refer to it via manifestlog.

This means all accesses go through repo.manifestlog, which is now invalidated
correctly.
2016-10-18 17:33:39 -07:00
Durham Goode
757b6fb5aa manifest: move manifest creation to a helper function
A future patch will be moving manifest creation to be inside manifestlog as part
of improving our cache guarantees. bundlerepo and unionrepo currently rely on
being able to hook into manifest creation, so let's temporarily move the actual
manifest creation to a helper function for them to intercept.

In the future manifest.manifest() will disappear entirely and this can
disappear.
2016-10-18 17:32:51 -07:00
Gregory Szorc
722900ff91 changegroup: increase write buffer size to 128k
By default, Python defers to the operating system for choosing the
default buffer size on opened files. On my Linux machine, the default
is 4k, which is really small for 2016.

This patch bumps the write buffer size when writing
changegroups/bundles to 128k. This matches the 128k read buffer
we already use on revlogs.

It's worth noting that this only impacts when writing to an explicit
file (such as during `hg bundle`). Buffers when writing to bundle
files via the repo vfs or to a temporary file are not impacted.

When producing a none-v2 bundle file of the mozilla-unified repository,
this change caused the number of write() system calls to drop from
952,449 to 29,788. After this change, the most frequent system
calls are fstat(), read(), lseek(), and open(). There were
2,523,672 system calls after this patch (so a net decrease of
~950k is statistically significant).

This change shows no performance change on my system. But I have a
high-end system with a fast SSD. It is quite possible this change
will have a significant impact on network file systems, where
extra network round trips due to excessive I/O system calls could
introduce significant latency.
2016-10-16 13:35:23 -07:00
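In Python, an explicit write buffer size is simply the third argument to open(). The sketch below assumes nothing about the changegroup code: `write_chunks` and `BUFSIZE` are illustrative names.

```python
BUFSIZE = 131072  # 128k, the size named in the commit message


def write_chunks(path, chunks, bufsize=BUFSIZE):
    """Write an iterable of byte chunks through a large write buffer."""
    # The buffered writer coalesces many small write() calls into fewer,
    # larger write() system calls.
    with open(path, 'wb', bufsize) as fh:
        for chunk in chunks:
            fh.write(chunk)
```

The file contents are identical either way; only the number of system calls needed to produce them changes.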
Pierre-Yves David
667d10975b changegroup: skip delta when the underlying revlog does not use them
Revlog can now be configured to store full snapshot only. This is used on the
changelog. However, the changegroup packing was still recomputing deltas to be
sent over the wire.

We now just reuse the full snapshot directly in this case, skipping delta
computation. This provides us with a large speedup (-30%):

# perfchangegroupchangelog on mercurial
! wall 2.010326 comb 2.020000 user 2.000000 sys 0.020000 (best of 5)
! wall 1.382039 comb 1.380000 user 1.370000 sys 0.010000 (best of 8)

# perfchangegroupchangelog on pypy
! wall 5.792589 comb 5.780000 user 5.780000 sys 0.000000 (best of 3)
! wall 3.911158 comb 3.920000 user 3.900000 sys 0.020000 (best of 3)

# perfchangegroupchangelog on mozilla central
! wall 20.683727 comb 20.680000 user 20.630000 sys 0.050000 (best of 3)
! wall 14.190204 comb 14.190000 user 14.150000 sys 0.040000 (best of 3)

Many tests had to be updated because of the change in bundle content. All
these updates have been verified. Because diffing the changelog was not very
valuable, the resulting bundles have similar sizes (often a bit smaller):

# full bundle of mozilla central
with delta:    1142740533B
without delta: 1142173300B

So this is a win all over the board.
2016-10-14 01:31:11 +02:00
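The "-30%" figure in 667d10975b can be rechecked from the wall times quoted in the message. The sketch below only does that arithmetic; the variable names are illustrative.

```python
# (before, after) wall times in seconds, copied from the commit message.
timings = {
    'mercurial': (2.010326, 1.382039),
    'pypy': (5.792589, 3.911158),
    'mozilla central': (20.683727, 14.190204),
}


def reduction(before, after):
    """Fraction of wall time saved."""
    return (before - after) / before


reductions = {name: reduction(b, a) for name, (b, a) in timings.items()}

# Bundle size difference in bytes (with delta minus without delta).
sizedelta = 1142740533 - 1142173300
```

All three repositories land a little above 30% saved, and the full bundle without deltas is slightly smaller than the one with deltas.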
Pierre-Yves David
b03bd97b6a revlog: make 'storedeltachains' a "public" attribute
The next changeset will make that attribute read by the changegroup packer. We
make it "public" beforehand.
2016-10-14 02:25:08 +02:00
Martin von Zweigbergk
5ebaaf2902 manifest: don't store None in fulltextcache
When we read a value from fulltextcache, we expect it to be an array,
so we should not store None in it. Found while working on narrowhg.
2016-10-17 22:51:22 -07:00
Gábor Stefanik
14dc42e666 copies: improve assertions during copy recombination
- Make sure there is nothing to recombine in non-graftlike scenarios
- More pythonic assert syntax
2016-10-18 02:09:08 +02:00
Martin von Zweigbergk
126f9b1a2d treemanifest: fix bad argument order to treemanifestctx
Found by running tests with _treeinmem (both of them) modified to be
True.
2016-10-17 16:12:12 -07:00
Gregory Szorc
1538b87cfc wireproto: compress data from a generator
Currently, the "getbundle" wire protocol command obtains a generator of
data, converts it to a util.chunkbuffer, then converts it back to a
generator via the protocol's groupchunks() implementation. For the SSH
protocol, groupchunks() simply reads 4kb chunks then write()s the
data to a file descriptor. For the HTTP protocol, groupchunks() reads
32kb chunks, feeds those into a zlib compressor, emits compressed data
as it is available, and that is sent to the WSGI layer, where it is
likely turned into HTTP chunked transfer chunks as is or further
buffered and turned into a larger chunk.

For both the SSH and HTTP protocols, there is inefficiency from using
util.chunkbuffer.

For SSH, emitting consistent 4kb chunks sounds nice. However, the file
descriptor it is writing to is almost certainly buffered. That means
that a Python .write() probably doesn't translate into exactly what is
written to the I/O layer.

For HTTP, we're going through an intermediate layer to zlib compress
data. So all util.chunkbuffer is doing is ensuring that the chunks we
feed into the zlib compressor are of uniform size. This means more CPU
time in Python buffering and emitting chunks in util.chunkbuffer but
fewer function calls to zlib.

This patch introduces and implements a new wire protocol abstract
method: compresschunks(). It is like groupchunks() except it operates
on a generator instead of something with a .read(). The SSH
implementation simply proxies chunks. The HTTP implementation uses
zlib compression.

To avoid duplicate code, the HTTP groupchunks() has been reimplemented
in terms of compresschunks().

To prove this all works, the "getbundle" wire protocol command has been
switched to compresschunks(). This removes the util.chunkbuffer from
that command. Now, data essentially streams straight from the
changegroup emitter to the wire, possibly through a zlib compressor.
Generators all the way, baby.

There were slim to no performance changes on the server as measured
with the mozilla-central repository. This is likely because CPU
time is dominated by reading revlogs, producing the changegroup, and
zlib compressing the output stream. Still, this brings us a little
closer to our ideal of using generators everywhere.
2016-10-16 11:10:21 -07:00
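A generator-to-generator compressor in the spirit of the compresschunks() method from 1538b87cfc could look like the sketch below. It is a standalone function, not the wire protocol method, and the real implementation may differ in detail.

```python
import zlib


def compresschunks(chunks):
    """Yield zlib-compressed data for an iterable of byte chunks."""
    z = zlib.compressobj()
    for chunk in chunks:
        data = z.compress(chunk)
        # The compressor may buffer internally and return nothing yet.
        if data:
            yield data
    # Emit whatever is still buffered once the input is exhausted.
    yield z.flush()
```

No intermediate file-like buffer is involved: input chunks are consumed as they are produced and compressed output is passed on as soon as the compressor releases it.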
Mads Kiilerich
20a4281d3a revset: optimize for destination() being "inefficient"
destination() will scan through the whole subset and read extras for each
revision to get its source.
2016-10-17 19:48:36 +02:00
Gábor Stefanik
2f48be6841 copies: make _checkcopies handle copy sequences spanning the TCA (issue4028)
When working in a rotated DAG (for a graftlike merge), there can be files
that are renamed both between the base and the topological CA, and between
the TCA and the endpoint farther from the base. Such renames span the TCA
(and thus need both passes of _checkcopies to be fully detected), but may
not necessarily be divergent.

Make _checkcopies return "incomplete copies" and "incomplete divergences"
in this case, and let mergecopies recombine them once data from both passes
of _checkcopies is available.

With this patch, all known cases involving renames and grafts pass.

(Developed together with Pierre-Yves David)
2016-10-11 04:39:47 +02:00
Gábor Stefanik
c03c8792d5 checkcopies: add logic to handle remotebase
As the two _checkcopies passes' ranges are separated by tca, not base,
only one of the two passes will actually encounter the base.
Pass "remotebase" to the other pass to let it know not to expect passing
over the base. This is required for handling a few unusual rename cases.
2016-10-11 04:25:59 +02:00
Denis Laxalde
37fb9b1045 cmdutil: add support for evolution "troubles" display in changeset_printer
Add a "trouble" line in the changeset header along with a couple of labels on
the "log.changeset" line to indicate whether a changeset is troubled or not and
which kind of trouble occurs.
2016-10-10 12:06:58 +02:00
Denis Laxalde
1f19429796 cmdutil: extract a _changesetlabels function out of changeset_printer._show()
There is a common logic in changeset_printer and in the summary command for
labelling a changeset.

This prepares extension of changeset's labels with evolution "troubles"
information that would show up in both log and summary outputs. Ultimately,
both would use this function.
2017-01-03 10:56:41 +01:00
Gábor Stefanik
912f58ada1 mergecopies: add logic to process incomplete data
We first combine incomplete copies on the two sides of the topological CA
into complete copies.
Any leftover incomplete copies are then combined with the incomplete
divergences to reconstruct divergences spanning over the topological CA.
Finally we promote any divergences falsely flagged as incomplete to full
divergences.

Right now, there is nothing generating incomplete copy/divergence data,
so this code does nothing. Changes to _checkcopies to populate these
dicts are coming later in this series.
2016-10-04 12:51:54 +02:00
Gábor Stefanik
242c4897e8 checkcopies: handle divergences contained entirely in tca::ctx
During a graftlike merge, _checkcopies runs from ctx to tca, possibly
passing over the merge base. If there is a rename both before and after
the base, then we're actually dealing with divergent renames.
If there is no rename on the other side of tca, then the divergence is
contained entirely in the range of one _checkcopies invocation, and
should be detected "in the loop" without having to rely on the other
_checkcopies pass.
2016-10-12 11:54:03 +02:00
Gábor Stefanik
a47c5119e6 update: enable copy tracing for backwards and non-linear updates
As a followup to the issue4028 series, this fixes a variant of the issue
that can occur when updating with uncommitted local changes.

The duplicated .hgsub warning is coming from wc.dirty(). We would previously
skip this call because it's only relevant when we're going to perform copy
tracing, which we didn't do before.

The change to the update summary line is because we now treat the rename as a
proper rename (which counts as a change), rather than an add+delete pair
(which counts as a change and a delete).
2016-08-25 22:02:26 +02:00
Gábor Stefanik
d967d939d6 mergecopies: invoke _computenonoverlap for both base and tca during merges
The algorithm of _checkcopies can only walk backwards in the DAG, never
forward. Because of this, the two _checkcopies passes need to run from
their respective endpoints to the TCA to cover the entire subgraph where
the merge is being performed. However, detection of files new in both
endpoints, as well as directory rename detection, need to run with respect
to the merge base, so we need lists of new files both from the TCA's and
the merge base's viewpoint to correctly detect renames in a graft-like
merge scenario.

(Series reworked by Pierre-Yves David)
2016-10-13 02:19:43 +02:00
Pierre-Yves David
cce3e9c3ad copies: make it possible to distinguish between _computenonoverlap invocations
_computenonoverlap needs to be invoked twice during a graft, and debugging
messages should be distinguishable between the two invocations
2016-10-18 00:00:43 +02:00
Gábor Stefanik
7730b47e09 copies: make _checkcopies handle simple renames in a rotated DAG
This introduces a distinction between "merge base" and
"topological common ancestor". During a regular merge, these two are
identical. Graft, however, performs a merge in a rotated DAG, where the
merge base will not be a common ancestor at all in the
original DAG.

To correctly find copies in case of a graft, we need to take both the
merge base and the topological CA into account, and track any renames
between them in reverse. Fortunately we can detect this in advance,
see comment in the code about "backwards".

This patch only supports finding non-divergent renames contained entirely
between the merge base and the topological CA. Further patches are coming
to support more complex cases.

(Pierre-Yves David was involved in the cleanup of this patch.)
2016-10-13 02:03:54 +02:00
Gábor Stefanik
60bab1ec6c copies: compute a suitable TCA if base turns out to be unsuitable
This will be used later in an update to _checkcopies.

(Pierre-Yves David was involved in the cleanup of this patch.)
2016-10-13 02:03:49 +02:00
Gábor Stefanik
6250f7ff54 copies: detect graft-like merges
Right now, nothing changes as a result of this, but we want to handle
grafts differently from ordinary merges later.

(Series developed together with Pierre-Yves David)
2016-10-13 01:47:33 +02:00
Gábor Stefanik
4adc2f1a6a checkcopies: add a sanity check against false-positive copies
When grafting a copy backwards through a rename, a copy is wrongly detected,
which causes the graft to be applied inappropriately, in a destructive way.
Make sure that the old file name really exists in the common ancestor,
and bail out if it doesn't.

This fixes the aggravated case of bug 5343, although the basic issue
(failure to duplicate the copy information) still occurs.
2016-10-12 21:33:45 +02:00
Gregory Szorc
26f6f03d4c exchange: refactor APIs to obtain bundle data (API)
Currently, exchange.getbundle() returns either a cg1unpacker or a
util.chunkbuffer (in the case of bundle2). This is kinda OK, as
both expose a .read() to consumers. However, localpeer.getbundle()
has code inferring what the response type is based on arguments and
converts the util.chunkbuffer returned in the bundle2 case to a
bundle2.unbundle20 instance. This is a sign that the API for
exchange.getbundle() is not ideal because it doesn't consistently
return an "unbundler" instance.

In addition, unbundlers mask the fact that there is an underlying
generator of changegroup data. In both cg1 and bundle2, this generator
is being fed into a util.chunkbuffer so it can be re-exposed as a
file object.

util.chunkbuffer is a nice abstraction. However, it should only be
used "at the edges." This is because keeping data as a generator is
more efficient than converting it to a chunkbuffer, especially if we
convert that chunkbuffer back to a generator (as is the case in some
code paths currently).

This patch refactors exchange.getbundle() into
exchange.getbundlechunks(). The new API returns an iterator of chunks
instead of a file-like object.

Callers of exchange.getbundle() have been updated to use the new API.

There is a minor change of behavior in test-getbundle.t. This is
because `hg debuggetbundle` isn't defining bundlecaps. As a result,
a cg1 data stream and unpacker is being produced. This is getting fed
into a new bundle20 instance via bundle2.writebundle(), which uses
a backchannel mechanism between changegroup generation to add the
"nbchanges" part parameter. I never liked this backchannel mechanism
and I plan to remove it someday. `hg bundle` still produces the
"nbchanges" part parameter, so there should be no user-visible
change of behavior. I consider this "regression" a bug in
`hg debuggetbundle`. And that bug is captured by an existing
"TODO" in the code to use bundle2 capabilities.
2016-10-16 10:38:52 -07:00
Pierre-Yves David
604c8243a9 mergecopies: rename 'ca' to 'base'
This variable was named after the common ancestor. It is actually the merge
base that might differ from the common ancestor in the graft case. We rename the
variable before a larger refactoring to clarify the situation. Similar rename
was also applied to 'checkcopies' in a prior changeset.
2016-10-13 01:30:14 +02:00
Pierre-Yves David
4eabc75da9 copies: move variable documentation from checkcopies to mergecopies
It appears that 'mergecopies' is the function consuming these data so we move
the documentation there.
2016-10-13 01:26:33 +02:00
Pierre-Yves David
9df147eb63 checkcopies: pass data as a dictionary of dictionaries
more are coming
2016-10-11 02:21:42 +02:00
Pierre-Yves David
80ed73689f checkcopies: move 'movewithdir' initialisation right before its usage
The 'movewithdir' variable had a lot of related logic spread all around
'mergecopies'. However, it never actually contains anything until the very last
loop in that function. We move the (simplified) variable definition there for
clarity.
2016-10-11 02:15:23 +02:00
Mads Kiilerich
4ebd936629 cmdutil: satisfy expectations in dirstateguard.__del__, even if __init__ fails
Python "delstructors" are terrible - this one because it assumed that __init__
had completed before it was called. That would not necessarily be the case if
the repository was read only or broken and saving the dirstate thus failed in
unexpected ways. That could give confusing warnings about missing '_active'
after failures.

To fix that, make sure all member variables are "declared" before doing
anything that possibly could fail. [Famous last words.]
2016-10-14 01:53:15 +02:00
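The fix in 4ebd936629 amounts to declaring members before any step that can fail. A rough sketch follows; the `Guard` class and its `save` callback are hypothetical, not the dirstateguard code.

```python
class Guard(object):
    def __init__(self, save):
        # Declare every member first, so __del__ always finds them.
        self._active = False
        self._closed = False
        save()  # may raise, e.g. on a read-only repository
        self._active = True

    def release(self):
        self._active = False
        self._closed = True

    def __del__(self):
        # Safe even when __init__ raised part-way through.
        if self._active:
            self.release()
```

If `save()` raises, the half-built object still has `_active` set to False, so its destructor does nothing and emits no warning about a missing attribute.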
Mads Kiilerich
39f2a13215 util: increase filechunkiter size to 128k
util.filechunkiter has been using a chunk size of 64k for more than 10 years,
also in years where Moore's law still was a law. It is probably ok to bump it
now and perhaps get a slight win in some cases.

Also, largefiles have been using 128k for a long time. Specifying that size
multiple times (or forgetting to do it) seems a bit stupid. Decreasing it to
64k also seems unfortunate.

Thus, we will set the default chunksize to 128k and use the default everywhere.
2016-10-14 01:53:15 +02:00
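A chunk iterator with a 128k default, as described in 39f2a13215, can be sketched as below. This is a simplified illustration and should not be read as the exact util.filechunkiter() code; the `limit` parameter is an assumption.

```python
def filechunkiter(f, size=131072, limit=None):
    """Yield the data of file object f in chunks of up to size bytes.

    If limit is given, stop after that many bytes in total.
    """
    assert size >= 0
    assert limit is None or limit >= 0
    while True:
        nbytes = size if limit is None else min(limit, size)
        s = nbytes and f.read(nbytes)
        if not s:
            break
        if limit is not None:
            limit -= len(s)
        yield s
```

Callers that want the common chunk size simply omit the `size` argument, which is the point of keeping a single default.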
Yuya Nishihara
7e790cf836 revset: for x^2, do not take null as a valid p2 revision
Since we don't count null p2 revision as a parent, x^2 should never return
null even if null is explicitly populated.
2016-10-14 23:33:00 +09:00
Yuya Nishihara
e0c2008a2f revset: make follow() reject more than one start revisions
Taking only the last revision is inconsistent because ancestors(set) follows
all revisions given, and theoretically follow(startrev=set) == ancestors(set).
I'm planning to add support for multiple start revisions, but that won't fit
into the 4.0 time frame. So reject multiple revisions now to avoid future BC.

len(revs) might be slow if revs were large, but we don't care since a valid
revs should have only one element.
2016-10-10 22:30:09 +02:00
Gregory Szorc
d694855e6f bundle2: only emit compressed chunks if they have data
This is similar to 72dcaa40df76. Not all calls into the compressor
return compressed data, as the compressor may buffer compressed
output internally. It is cheaper to check for empty chunks than to
send empty chunks through the generator.

When generating a gzip-v2 bundle of the mozilla-unified repo, this
change results in 50,093 empty chunks not being sent through the
generator (out of 1,902,996 total input chunks).
2016-10-15 17:10:53 -07:00
Stanislau Hlebik
b83f0c0687 update: warn if cwd was deleted
During update, directories are deleted as soon as they have no entries.
But if the current working directory is deleted, it causes problems
in complex commands like 'hg split'. This commit adds a warning
that will help users figure out the problem faster.
2016-10-04 04:06:48 -07:00
Gregory Szorc
34b225b38d parsers: avoid PySliceObject cast on Python 3
PySlice_GetIndicesEx() accepts a PySliceObject* on Python 2 and a
PyObject* on Python 3. Casting to PySliceObject* on Python 3 was
yielding a compiler warning. So stop doing that.

With this patch, I no longer see any compiler warnings when
building the core extensions for Python 3!
2016-10-13 13:34:53 +02:00
Gregory Szorc
5f9c903265 bdiff: include util.h
Without this, IS_PY3K isn't defined and the preprocessor uses the
incorrect module loading code, causing the module to fail to load at
run-time.

After this patch, all our C extensions (except for watchman's) appear
to import correctly in Python 3!
2016-10-13 13:27:14 +02:00
Gregory Szorc
b946a5f05c parsers: alias more PyInt* symbols on Python 3
I feel dirty for having to do this. But this is currently our approach
for dealing with PyInt -> PyLong in Python 3 for this file.

This removes a ton of compiler warnings by fixing unresolved symbols.
2016-10-13 13:22:40 +02:00
Gregory Szorc
05e74565d0 manifest: use PyVarObject_HEAD_INIT
More appeasing the Python 3 and compiler overlords. The code is
equivalent.
2016-10-13 13:17:23 +02:00
Gregory Szorc
b5f731482f dirs: use PyVarObject_HEAD_INIT
This makes a compiler warning go away on Python 3.
2016-10-13 13:14:14 +02:00
Martijn Pieters
217717bef0 py3: use namedtuple._replace to produce new tokens 2016-10-13 09:27:37 +01:00
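Tokens produced by the tokenize module are namedtuples, so `_replace()` returns a modified copy and leaves the original untouched. The sketch below illustrates that; the b-prefix rewrite is only an example transform and is not taken from the commit.

```python
import io
import tokenize

source = b"x = 'abc'\n"
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))


def addbprefix(tok):
    """Return a copy of a STRING token with a b prefix (example only)."""
    if tok.type == tokenize.STRING:
        # _replace() builds a new TokenInfo; position fields are kept.
        return tok._replace(string='b' + tok.string)
    return tok


rewritten = [addbprefix(t) for t in tokens]
```

Only the `string` field changes; `type`, `start`, `end` and `line` are carried over from the original token.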
Martijn Pieters
f29fa5e5a1 py3: refactor token parsing to handle call args properly
The token parsing was getting unwieldy and was too naive about accessing
arguments.
2016-10-14 17:55:02 +01:00
Gregory Szorc
44818740cb pathencode: use assert() for PyBytes_Check()
This should have been added in 57bdf32c342e. I sent the patch to the
list prematurely.
2016-10-13 21:42:11 +02:00
Mads Kiilerich
ecb497971e merge: clarify warning for (not) merging flags without ancestor
Give hints why it can't merge and what it will do instead.
2016-10-12 12:22:18 +02:00
Mads Kiilerich
dc79a99eb6 merge: only show "cannot merge flags for %s" warning if flags are different 2016-10-12 12:22:18 +02:00
Gregory Szorc
a17dfc705a dirs: document Py_SIZE weirdness
Assigning to what looks like a function is clown shoes. Document that
it is a macro referring to a struct member.
2016-10-08 17:07:43 +02:00
Philippe Pepiot
82dc121fd0 commit: return 1 for interactive commit with no changes (issue5397)
For consistency with non-interactive commit.
2016-10-14 09:52:38 +02:00
Mads Kiilerich
7f0aee28c1 demandimport: disable lazy import of __builtin__
Demandimport uses the "try to import __builtin__, else use builtins" trick to
handle Python 3. External libraries and extensions might do something similar.
On Fedora 25 subversion-python-1.9.4-4.fc25.x86_64 will do just that (except
the opposite) ... and it failed all subversion convert tests because
demandimport was hiding that it didn't have builtins but should use
__builtin__.

The builtin module has already been imported when demandimport is loaded so
there is no point in trying to import it on demand. Just always ignore both
variants in demandimport.
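
A rough sketch of why a lazy importer can defeat the try/except trick, assuming
Python 3 (where only `builtins` exists). `LazyModule`, `lazy_import` and the
`IGNORE` set are hypothetical stand-ins, not demandimport's real API.

```python
import importlib


class LazyModule:
    """Toy proxy that postpones the real import until first attribute access."""

    def __init__(self, name):
        self._name = name

    def __getattr__(self, attr):
        # Any ImportError surfaces here, long after the try/except has run.
        module = importlib.import_module(self._name)
        return getattr(module, attr)


def lazy_import(name, ignore=frozenset()):
    """Import eagerly for names in `ignore`, lazily otherwise."""
    if name in ignore:
        return importlib.import_module(name)
    return LazyModule(name)


def find_builtins(importer):
    """The 'try one name, else the other' trick described above."""
    try:
        return importer("__builtin__")
    except ImportError:
        return importer("builtins")


# Hypothetical ignore list covering both spellings of the module.
IGNORE = frozenset(["__builtin__", "builtins"])

good = find_builtins(lambda name: lazy_import(name, IGNORE))
hidden = find_builtins(lazy_import)  # no ImportError yet: the failure is hidden

try:
    hidden.len
    failed_late = False
except ImportError:
    failed_late = True
```

With both names ignored, the ImportError is raised at import time and the
fallback branch runs as the calling code expects.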
2016-10-14 03:03:39 +02:00
Gregory Szorc
0ee2ea3be0 changelog: disable delta chains
This patch disables delta chains on changelogs. After this patch, new
entries on changelogs - including existing changelogs - will be stored
as the fulltext of that data (likely compressed). No delta computation
will be performed.

An overview of delta chains and data justifying this change follows.

Revlogs try to store entries as a delta against a previous entry (either
a parent revision in the case of generaldelta or the previous physical
revision when not using generaldelta). Most of the time this is the
correct thing to do: it frequently results in less CPU usage and smaller
storage.

Delta chains are most effective when the base revision being deltad
against is similar to the current data. This tends to occur naturally
for manifests and file data, since only small parts of each tend to
change with each revision. Changelogs, however, are a different story.
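
To make the similarity argument concrete, here is a toy measurement using
Python's difflib rather than Mercurial's own delta code. The sample texts and
the `delta_size` helper are invented for illustration; real revlog deltas are
binary and computed differently.

```python
import difflib


def delta_size(base, new):
    """Count the characters of `new` that a line-based delta must carry."""
    a = base.splitlines(keepends=True)
    b = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    size = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            size += sum(len(line) for line in b[j1:j2])
    return size


# Manifest-like data: many lines, only one of which changes.
manifest_v1 = "".join("file%d.txt %040x\n" % (i, i) for i in range(50))
manifest_v2 = manifest_v1.replace("file7.txt %040x" % 7, "file7.txt %040x" % 999)

# Changelog-like data: nearly every line differs between entries.
entry_1 = "manifesthash1\nalice\n1476350000 0\nsrc/a.c\n\nfix crash in parser\n"
entry_2 = "manifesthash2\nbob\n1476360000 0\ndocs/readme\n\nupdate documentation\n"

manifest_ratio = delta_size(manifest_v1, manifest_v2) / len(manifest_v2)
changelog_ratio = delta_size(entry_1, entry_2) / len(entry_2)
print("manifest delta / fulltext:  %.2f" % manifest_ratio)
print("changelog delta / fulltext: %.2f" % changelog_ratio)
```

In this toy setup the manifest-like delta is a small fraction of the fulltext,
while the changelog-like delta is almost as large as the entry itself.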

Changelog entries represent changesets/commits. And unless commits in a
repository are homogeneous (same author, changing same files, similar
commit messages, etc), a delta from one entry to the next tends to be
relatively large compared to the size of the entry. This means that
delta chains tend to be short. How short? Here is the full vs delta
revision breakdown on some real world repos:

Repo             % Full    % Delta   Max Length
hg                45.8       54.2        6
mozilla-central   42.4       57.6        8
mozilla-unified   42.5       57.5       17
pypy              46.1       53.9        6
python-zstandard  46.1       53.9        3

(I threw in python-zstandard as an example of a repo that is homogeneous.
It contains a small Python project with changes all from the same
author.)

Contrast this with the manifest revlog for these repos, where 99+% of
revisions are deltas and delta chains run into the thousands.

So delta chains aren't as useful on changelogs. But even a short delta
chain may provide benefits. Let's measure that.

Delta chains may require less CPU to read revisions if the CPU time
spent reading smaller deltas is less than the CPU time used to
decompress larger individual entries. We can measure this via
`hg perfrevlog -c -d 1` to iterate a revlog to resolve each revision's
fulltext. Here are the results of that command on a repo using delta
chains in its changelog and on a repo without delta chains:

hg (forward)
! wall 0.407008 comb 0.410000 user 0.410000 sys 0.000000 (best of 25)
! wall 0.390061 comb 0.390000 user 0.390000 sys 0.000000 (best of 26)

hg (reverse)
! wall 0.515221 comb 0.520000 user 0.520000 sys 0.000000 (best of 19)
! wall 0.400018 comb 0.400000 user 0.390000 sys 0.010000 (best of 25)

mozilla-central (forward)
! wall 4.508296 comb 4.490000 user 4.490000 sys 0.000000 (best of 3)
! wall 4.370222 comb 4.370000 user 4.350000 sys 0.020000 (best of 3)

mozilla-central (reverse)
! wall 5.758995 comb 5.760000 user 5.720000 sys 0.040000 (best of 3)
! wall 4.346503 comb 4.340000 user 4.320000 sys 0.020000 (best of 3)

mozilla-unified (forward)
! wall 4.957088 comb 4.950000 user 4.940000 sys 0.010000 (best of 3)
! wall 4.660528 comb 4.650000 user 4.630000 sys 0.020000 (best of 3)

mozilla-unified (reverse)
! wall 6.119827 comb 6.110000 user 6.090000 sys 0.020000 (best of 3)
! wall 4.675136 comb 4.670000 user 4.670000 sys 0.000000 (best of 3)

pypy (forward)
! wall 1.231122 comb 1.240000 user 1.230000 sys 0.010000 (best of 8)
! wall 1.164896 comb 1.160000 user 1.160000 sys 0.000000 (best of 9)

pypy (reverse)
! wall 1.467049 comb 1.460000 user 1.460000 sys 0.000000 (best of 7)
! wall 1.160200 comb 1.170000 user 1.160000 sys 0.010000 (best of 9)

The data clearly shows that it takes less wall and CPU time to resolve
revisions when there are no delta chains in the changelogs, regardless
of the direction of traversal. Furthermore, not using a delta chain
means that fulltext resolution in reverse is as fast as iterating
forward. So not using delta chains on the changelog is a clear CPU win
for reading operations.

An example of a user-visible operation showing this speed-up is revset
evaluation. Here are results for
`hg perfrevset 'author(gps) or author(mpm)'`:

hg
! wall 1.655506 comb 1.660000 user 1.650000 sys 0.010000 (best of 6)
! wall 1.612723 comb 1.610000 user 1.600000 sys 0.010000 (best of 7)

mozilla-central
! wall 17.629826 comb 17.640000 user 17.600000 sys 0.040000 (best of 3)
! wall 17.311033 comb 17.300000 user 17.260000 sys 0.040000 (best of 3)

What about 00changelog.i size?

Repo                Delta Chains     No Delta Chains
hg                    7,033,250         6,976,771
mozilla-central      82,978,748        81,574,623
mozilla-unified      88,112,349        86,702,162
pypy                 20,740,699        20,659,741

The data shows that removing delta chains from the changelog makes the
changelog smaller.

Delta chains are also used during changegroup generation. This
operation essentially converts a series of revisions to one large
delta chain. And changegroup generation is smart: if the delta in
the revlog matches what the changegroup is emitting, it will reuse
the delta instead of recalculating it. We can measure the impact
removing changelog delta chains has on changegroup generation via
`hg perfchangegroupchangelog`:

hg
! wall 1.589245 comb 1.590000 user 1.590000 sys 0.000000 (best of 7)
! wall 1.788060 comb 1.790000 user 1.790000 sys 0.000000 (best of 6)

mozilla-central
! wall 17.382585 comb 17.380000 user 17.340000 sys 0.040000 (best of 3)
! wall 20.161357 comb 20.160000 user 20.120000 sys 0.040000 (best of 3)

mozilla-unified
! wall 18.722839 comb 18.720000 user 18.680000 sys 0.040000 (best of 3)
! wall 21.168075 comb 21.170000 user 21.130000 sys 0.040000 (best of 3)

pypy
! wall 4.828317 comb 4.830000 user 4.820000 sys 0.010000 (best of 3)
! wall 5.415455 comb 5.420000 user 5.410000 sys 0.010000 (best of 3)

The data shows eliminating delta chains makes the changelog part of
changegroup generation slower. This is expected since we now have to
compute deltas for revisions where we could recycle the delta before.

It is worth putting this regression into context of overall changegroup
times. Here is the rough total CPU time spent in changegroup generation
for various repos while using delta chains on the changelog:

Repo              CPU Time (s)    CPU Time w/ compression
hg                  4.50              7.05
mozilla-central   111.1             222.0
pypy               28.68             75.5

Before compression, removing delta chains from the changelog adds
~4.4% overhead to hg changegroup generation, 1.3% to mozilla-central,
and 2.0% to pypy. When you factor in zlib compression, these percentages
are roughly divided by 2.
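
The percentages can be rechecked from the figures quoted in this message. The
sketch below assumes the overhead is the increase in
`hg perfchangegroupchangelog` wall time divided by the total changegroup CPU
time; that formula is an assumption, since the message does not spell it out.

```python
def overhead_pct(before, after, total):
    """Extra changelog time as a percentage of total changegroup time."""
    return 100.0 * (after - before) / total


# (with delta chains, without delta chains) wall seconds, copied from above.
changelog_times = {
    "hg": (1.589245, 1.788060),
    "mozilla-central": (17.382585, 20.161357),
    "pypy": (4.828317, 5.415455),
}
# (CPU seconds, CPU seconds with compression), copied from the table above.
totals = {
    "hg": (4.50, 7.05),
    "mozilla-central": (111.1, 222.0),
    "pypy": (28.68, 75.5),
}

results = {}
for repo, (before, after) in changelog_times.items():
    plain, compressed = totals[repo]
    results[repo] = (
        overhead_pct(before, after, plain),
        overhead_pct(before, after, compressed),
    )
    print("%-16s %.1f%% %.1f%%" % ((repo,) + results[repo]))
```

Under that assumption the hg and pypy figures come out at about 4.4% and 2.0%
before compression. The same arithmetic puts mozilla-central nearer 2.5%
before compression and about 1.3% with it.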

While the increased CPU usage for changegroup generation is unfortunate,
I think it is acceptable because the percentage is small, server
operators (those likely impacted most by this) have other mechanisms
to mitigate CPU consumption (namely reducing zlib compression level and
pre-generated clone bundles), and because there is room to optimize this
in the future. For example, we could use the nullid as the base revision,
effectively encoding the full revision for each entry in the changegroup.
When doing this, `hg perfchangegroupchangelog` nearly halves:

mozilla-unified
! wall 21.168075 comb 21.170000 user 21.130000 sys 0.040000 (best of 3)
! wall 11.196461 comb 11.200000 user 11.190000 sys 0.010000 (best of 3)

This looks very promising as a future optimization opportunity.

It's worth noting the changes in test-acl.t to the changegroup part size.
This is because revision 6 in the changegroup had a delta chain of
length 2 before and after this patch the base revision is nullrev.
When the base revision is nullrev, cg2packer.deltaparent() hardcodes
the *previous* revision from the changegroup as the delta parent.
This caused the delta in the changegroup to switch base revisions,
the delta to change, and the size to change accordingly. While the
size increased in this case, I think sizes will remain the same
on average, as the delta base for changelog revisions doesn't matter
too much (as this patch shows). So, I don't consider this a regression.
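
A simplified sketch of the behavior described in the last paragraph. It covers
only the nullrev case mentioned here; the real cg2packer.deltaparent() takes
more arguments and may apply further conditions. The function name, the
signature and the revision numbers are hypothetical.

```python
NULLREV = -1  # "no delta base": the revision is stored as a fulltext


def delta_parent(stored_base, prev):
    """Pick the delta base to emit for a revision in the changegroup.

    stored_base: base the revlog stored this revision against, or NULLREV.
    prev: revision emitted immediately before this one in the changegroup.
    """
    if stored_base == NULLREV:
        # A fulltext has no stored delta to reuse, so delta against prev.
        return prev
    return stored_base


# Hypothetical numbers: a revision stored as a delta keeps its stored base...
with_chain = delta_parent(stored_base=4, prev=5)
# ...while one stored as a fulltext switches to the previous revision.
without_chain = delta_parent(stored_base=NULLREV, prev=5)
```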
2016-10-13 12:50:27 +02:00
Gregory Szorc
748ec42334 revlog: add instance variable controlling delta chain use
This is to support disabling delta chains on the changelog in a
subsequent patch.
2016-09-24 12:25:37 -07:00
Gregory Szorc
eb9f859c39 changegroup: document deltaparent's choice of previous revision
As part of debugging low-level changegroup generation, I came across
what I initially thought was a weird behavior: changegroup v2 is
choosing the previous revision in the changegroup as a delta base
instead of p1. I was tempted to rewrite this to use p1, as p1
will delta better than prev in the common case. However, I realized
that taking p1 as the base would potentially require resolving a
revision fulltext and thus require more CPU for e.g. server-side
processing of getbundle requests.

This patch tweaks the code comment to note the choice of behavior.
It also notes there is room for a flag or config option to tweak
this behavior later: using p1 as the delta base would likely make
changegroups smaller at the expense of more CPU, which could be
beneficial for things like clone bundles.
2016-10-13 12:49:47 +02:00
Pierre-Yves David
d482e52866 help: backout 6f89f03ad369 (mark boolean flags with [no-] in help) for now
The ability to negate any boolean flag is itself great, but I think we are not
ready to expose the help side of it yet.

First, while there exist a handful of such flags whose default value can be
changed (eg: git diff, patchwork confirmation), there are only a few of them. The
users who benefit the most from this change are alias users and large
installations that can deploy extensions to change behavior (eg: facebook
tweakdefault). So the majority of users will be affected by a large change
to command help that is not yet relevant to them. (I expect this to become
relevant when ui.progressive starts to exist).

Below is an example of the impact of the new help on 'hg help diff':

  -r --rev REV [+]              revision
  -c --change REV               change made by revision
  -a --[no-]text                treat all files as text
  -g --[no-]git                 use git extended diff format
     --[no-]nodates             omit dates from diff headers
     --[no-]noprefix            omit a/ and b/ prefixes from filenames
  -p --[no-]show-function       show which function each change is in
     --[no-]reverse             produce a diff that undoes the changes
  -w --[no-]ignore-all-space    ignore white space when comparing lines
  -b --[no-]ignore-space-change ignore changes in the amount of white space
  -B --[no-]ignore-blank-lines  ignore changes whose lines are all blank
  -U --unified NUM              number of lines of context to show
     --[no-]stat                output diffstat-style summary of changes
     --root DIR                 produce diffs relative to subdirectory
  -I --include PATTERN [+]      include names matching the given patterns
  -X --exclude PATTERN [+]      exclude names matching the given patterns
  -S --[no-]subrepos            recurse into subrepositories

Another issue with the current state of help is that the default value for the
flag is not conveyed to the user. For example in the 'backout' help, there is
no real distinction between "--[no-]backup" (default to True) and "--[no-]keep"
(default to False):

  --[no-]backup        no backups
  --[no-]keep          do not modify working directory during strip

In addition, I've discussed this with Augie Fackler, and the last batch of work
on this has burned him out quite a bit. Therefore he does not intend to perform
any more work on this topic. Quoting him, he would rather see the help part
backed out than spend more time on it.

I do not think we are ready to expose this to users in 4.0 (freeze in a week),
especially because we cannot expect quick improvement on these aspects as this
topic no longer has an owner. We should be able to reintroduce that change in
the future when someone gets back on it and the main issues are solved:

* Introduction of ui.progressive makes it relevant for a majority of users,
* Current default values are efficiently conveyed to the user.

(In addition, the excerpt from diff help shows that we still have some issues with
some negative options like '--nodates', so further improvements are probably
welcome there.)
2016-10-09 03:11:18 +02:00
Augie Fackler
530be6f190 copy: distinguish "file exists" cases and add a hint (BC)
Users that want to add a copy record to an existing commit with 'hg
commit --amend' should be guided towards this workflow, rather than
reaching for some sort of uncommit-recommit flow. As part of this,
distinguish in the top-line error message whether the file merely
already exists (untracked) on disk or the file already exists in
history.

The full list of copy and rename cases and how they interact with
flags are listed below:

target exists  --after  --force  |  action
      n           n        *     |  copy
      n           y        *     |  (1)
  untracked       n        n     |  (4) NEWHINT
  untracked       n        y     |  (3)
  untracked       y        *     |  (2)
      y           n        n     |  (4) NEWHINT
      y           n        y     |  (3)
      y           y        n     |  (2)
      y           y        y     |  (3)
   deleted        n        n     |  copy
   deleted        n        y     |  (3)
   deleted        y        n     |  (1)
   deleted        y        y     |  (1)

* = don't care
(1) <src>: not recording move - <target> does not exist
(2) preserve target contents
(3) replace target contents
(4) <target>: not overwriting - file {exists,already committed}
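
The table above can also be read as a lookup, sketched here as plain data. This
is only a transcription of the rows for illustration; the names `ACTIONS` and
`copy_action` are hypothetical and do not come from the Mercurial source.

```python
# Keys are (target state, --after, --force). None in the last slot marks a
# "don't care" row, i.e. the "*" entries in the table.
ACTIONS = {
    ("n", False, None): "copy",
    ("n", True, None): "(1)",
    ("untracked", False, False): "(4)",
    ("untracked", False, True): "(3)",
    ("untracked", True, None): "(2)",
    ("y", False, False): "(4)",
    ("y", False, True): "(3)",
    ("y", True, False): "(2)",
    ("y", True, True): "(3)",
    ("deleted", False, False): "copy",
    ("deleted", False, True): "(3)",
    ("deleted", True, False): "(1)",
    ("deleted", True, True): "(1)",
}


def copy_action(target, after, force):
    """Return the table's action for a target state and flag combination."""
    for key in ((target, after, force), (target, after, None)):
        if key in ACTIONS:
            return ACTIONS[key]
    raise ValueError("no table row for %r" % ((target, after, force),))


print(copy_action("untracked", after=False, force=False))
print(copy_action("y", after=True, force=True))
```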

Credit to Kevin for wholly rewriting my table to cover more cases we
discovered at the sprint.

I think this change gets the hints correct in all cases, but I'd
appreciate close inspection of the test cases to make sure I haven't
gotten turned around in here.
2016-09-19 17:15:39 -04:00
Mads Kiilerich
7afa73604d largefiles: use context for file closing
Make the code slightly smaller and safer (and more deeply indented).
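
For readers unfamiliar with the pattern, a generic sketch of context-managed
file handling follows. It is not taken from the largefiles code; the function
and file names are made up.

```python
import os
import tempfile


def copy_contents(src, dst):
    """Copy src to dst; both files are closed even if read or write raises."""
    with open(src, "rb") as infile, open(dst, "wb") as outfile:
        outfile.write(infile.read())


tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "large.bin")
dst = os.path.join(tmpdir, "copy.bin")

with open(src, "wb") as fp:
    fp.write(b"payload")

copy_contents(src, dst)

with open(dst, "rb") as fp:
    copied = fp.read()
```

The `with` block replaces an explicit try/finally around `close()`, which is
where both the smaller code and the extra indentation come from.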
2016-10-08 00:59:41 +02:00
Gregory Szorc
48ea4c11ea dirs: add comment about _PyBytes_Resize
So readers have a canonical function to compare this code to.
2016-10-13 10:59:29 +02:00
Pierre-Yves David
163070ae3d checkcopies: extract the '_related' closure
There is no need for it to be a closure.
2016-10-11 01:29:08 +02:00
Pierre-Yves David
2d670597fe checkcopies: add an inline comment about the '_related' call
This helps in understanding the flow of the function.
2016-10-08 23:00:55 +02:00
Pierre-Yves David
71b0c4ef9c checkcopies: minor change to comment
This helped me understand the refactoring so this must be helpful.
2016-10-08 19:03:16 +02:00
Pierre-Yves David
1f966aa892 checkcopies: rename 'ca' to 'base'
This variable was named after the common ancestor. It is actually the merge
base, which might differ from the common ancestor in the graft case. We rename
the variable before a larger refactoring to clarify the situation.
2016-10-08 18:38:42 +02:00
Pierre-Yves David
023c7518b5 bisect: extract a small initialisation outside of a loop
Having initialisation done during the first iteration is cute, but can be
avoided.
2016-08-24 05:09:46 +02:00
Martijn Pieters
6c2c90ea4c pycompat: only accept a bytestring filepath in Python 2 2016-10-10 23:11:15 +01:00