Commit Graph

1012 Commits

Author SHA1 Message Date
Yuya Nishihara
59eaa37546 py3: select input or raw_input by pycompat
This seems slightly cleaner.
2017-08-16 13:54:24 +09:00
Yuya Nishihara
9837558150 py3: make encoding.strio() an identity function on Python 2
It's the convention the other encoding.str*() functions follow. To make things
simple, this also drops kwargs from the strio() constructor.
2017-08-16 13:50:11 +09:00
Augie Fackler
a80f148d0c py3: introduce a wrapper for __builtins__.{raw_,}input()
In order to make this work, we have to wrap the io streams in a
TextIOWrapper so that __builtins__.input() can do unicode IO on Python
3. We can't just restore the original (unicode) sys.std* because we
might be running a cmdserver, and if we blindly restore sys.* to the
original values then we end up breaking the cmdserver. Sadly,
TextIOWrapper tries to close the underlying stream during its __del__,
so we have to make a sublcass to prevent that.

If you see errors like:

TypeError: a bytes-like object is required, not 'str'

On an input() or print() call on Python 3, the substitution of
sys.std* is probably the root cause.

A previous version of this change tried to put the bytesinput() method
in pycompat - it turns out we need to do some encoding handling, so we
have to be in a higher layer that's allowed to use
mercurial.encoding.encoding. As a result, this is in util for now,
with the TextIOWrapper subclass hiding in encoding.py. I'm not sure of
a better place for the time being.

Differential Revision: https://phab.mercurial-scm.org/D299
2017-07-24 14:38:40 -04:00
FUJIWARA Katsunori
c8b11bcb1f i18n: get translation entries for description of each compression engines
Now, hggettext can be applied safely on util.py, of which
i18nfunctions contains appropriate objects related to each compression
types.
2017-08-15 21:09:33 +09:00
FUJIWARA Katsunori
cabc81b3e9 i18n: use saved object to get actual function information if available
To list up available compression types instead of
".. bundlecompressionmarker" in "hg help bundlespec" output, proxy
object "docobject" is used, because:

- current online help system requires that __doc__ of registered
  object (maybe, function) is already well formatted in reST syntax

- bundletype() method of compressionengine classes is used to list up
  available compression types, but

- __doc__ of bundletype() object (= "instancemethod") is read-only

On the other hand, hggettext requires original function object, in
order to get document location in source code.

Therefore, description of each compression types isn't yet
translatable. Even if translatable, translators should make much
effort to determine location of original texts in source code.

To get actual function information, this patch makes hggettext use
function object saved as "_origfunc", if it is available. This patch
also changes bundlecompressiontopics() side, in order to explain how
these changes work easily.

This patch is a part of preparations for making description of each
compression types translatable.
2017-08-15 19:27:24 +09:00
Jun Wu
9f55a1848e util: make nogc effective for CPython
a3022f57803b made `util.nogc` a no-op. It was to optimize PyPy. But it slows
down CPython if many objects (like 300k+) are created.

For example, running `hg log -r .` without extensions in `hg-committed` with
14k+ obsmarkers have the following times:

  before        | after
  hg    | chg   | hg    | chg
  -----------------------------
  1.262 | 0.860 | 1.077 | 0.619 (seconds, best of 20 runs)

Therefore let's re-enable nogc for CPython.

Differential Revision: https://phab.mercurial-scm.org/D402
2017-08-14 22:28:59 -07:00
Martin von Zweigbergk
e5ad1ba424 util: add base class for transactional context managers
We have at least three types with a close() and a release() method
where the close() method is supposed to be called on success and the
release() method is supposed to be called last, whether successful or
not. Two of them (transaction and dirstateguard) already have
identical implementations of __enter__ and __exit__. Let's extract a
base class for this, so we reuse the code and so the third type
(transactionmanager) can also be used as a context manager.

Differential Revision: https://phab.mercurial-scm.org/D392
2017-07-28 22:42:10 -07:00
Augie Fackler
ef945af30b merge with stable 2017-08-10 18:55:33 -04:00
Augie Fackler
9a0febea27 merge with stable 2017-08-10 14:23:41 -04:00
Yuya Nishihara
509744ddfc ssh: unban the use of pipe character in user@host:port string
This vulnerability was fixed by the previous patch and there were more ways
to exploit than using '|shellcmd'. So it doesn't make sense to reject only
pipe character.

Test cases are updated to actually try to exploit the bug. As the SSH bridge
of git/svn subrepos are not managed by our code, the tests for non-hg subrepos
are just removed.

This may be folded into the original patches.
2017-08-07 22:22:28 +09:00
Yuya Nishihara
caba95785d util: fix sortdict.update() to call __setitem__() on PyPy (issue5639)
It appears that overriding __setitem__() doesn't work as documented on PyPy.
Let's patch it as before e5e7b1586953.

https://docs.python.org/2/library/collections.html#ordereddict-examples-and-recipes

The issue was ui.configitems() wasn't ordered correctly, so the pull command
was wrapped in different order.
2017-08-02 22:51:19 +09:00
Sean Farley
da301ac6a0 subrepo: add tests for svn rogue ssh urls (SEC)
'ssh://' has an exploit that will pass the url blindly to the ssh
command, allowing a malicious person to have a subrepo with
'-oProxyCommand' which could run arbitrary code on a user's machine. In
addition, at least on Windows, a pipe '|' is able to execute arbitrary
commands.

When this happens, let's throw a big abort into the user's face so that
they can inspect what's going on.
2017-07-31 16:44:17 -07:00
Sean Farley
608ad9eb9e util: add utility method to check for bad ssh urls (SEC)
Our use of SSH has an exploit that will parse the first part of an url
blindly as a hostname. Prior to this set of security patches, a url
with '-oProxyCommand' could run arbitrary code on a user's machine. In
addition, at least on Windows, a pipe '|' can be abused to execute
arbitrary commands in a similar fashion.

We defend against this by checking ssh:// URLs and looking for a
hostname that starts with a - or contains a |.

When this happens, let's throw a big abort into the user's face so
that they can inspect what's going on.
2017-07-28 16:32:25 -07:00
Alex Gaynor
088c6ecb28 util: remove dead code which used to be for old python2 versions
Differential Revision: https://phab.mercurial-scm.org/D107
2017-07-17 12:38:07 -04:00
Yuya Nishihara
fb236e4381 py3: convert arbitrary exception object to byte string more reliably
Our exception types implement __bytes__(), which should be tried first. Do
lossy encoding conversion as a last resort.
2017-08-03 23:02:32 +09:00
Durham Goode
87e4b5267e rebase: use one dirstateguard for when using rebase.singletransaction
This was previously landed as 4bc0c14fb501 but backed out in b63351f6a2 because
it broke hooks mid-rebase and caused conflict resolution data loss in the event
of unexpected exceptions. This new version adds the behavior back but behind a
config flag, since the performance improvement is notable in large repositories.

The old commit message was:

Recently we switched rebases to run the entire rebase inside a single
transaction, which dramatically improved the speed of rebases in repos with
large working copies. Let's also move the dirstate into a single dirstateguard
to get the same benefits. This let's us avoid serializing the dirstate after
each commit.

In a large repo, rebasing 27 commits is sped up by about 20%.

I believe the test changes are because us touching the dirstate gave the
transaction something to actually rollback.
(grafted from 9e3dc3a1638b9754b58a0cb26aaa75d868058109)
(grafted from 7d38b41d2266d9a02a15c64229fae0da5738dcec)

Differential Revision: https://phab.mercurial-scm.org/D135
2017-07-20 01:30:41 -07:00
Martin von Zweigbergk
cce08d0996 histedit: extract InterventionRequired transaction handling to utils
rebase will have similar logic, so let's extract it. Besides, it makes
the histedit code more readable.

We may want to parametrize acceptintervention() by the exception(s)
that should result in transaction close.

Differential Revision: https://phab.mercurial-scm.org/D66
2017-07-12 13:57:03 -07:00
Martin von Zweigbergk
dde7459563 util: remove unused ctxmanager
This was meant as a substitute for Python's "with" with multiple
context managers before we moved to Python 2.7. We're now on 2.7, so
we should have no reason to keep ctxmanager. "hg grep --all
ctxmanager" says that it was never used anyway.

Differential Revision: https://phab.mercurial-scm.org/D73
2017-07-13 09:51:50 -07:00
Pulkit Goyal
d7e69a39e2 py3: add b'' to make the regex pattern bytes 2017-06-25 03:11:55 +05:30
Pulkit Goyal
10c2f21384 py3: add b'' to make a triple quoted string bytes on Python 3
Transformer does not adds b'' in front of triple quoted strings to prevent
converting docs to bytes.
2017-06-24 19:57:50 +05:30
Yuya Nishihara
0c45446525 py3: add utility to forward __str__() to __bytes__()
It calls unifromlocal() instead of sysstr() because __bytes__() may contain
locale-dependent values such as paths.
2017-06-24 13:48:04 +09:00
Matt Harbison
fcd2a4509b plan9: drop py26 hacks 2017-06-16 18:42:03 -04:00
Siddharth Agarwal
d3ed1149d0 fsmonitor: don't write out state if identity has changed (issue5581)
Inspired by the dirstate fix in 39954a8760cd, this should fix any race
conditions with the fsmonitor state changing from underneath.

Since we now grab the wlock for any non-invalidate writes, the only situation
this appears to happen in is with a concurrent invalidation. Test that.
2017-06-12 15:34:31 -07:00
Siddharth Agarwal
e0e3b8e4ce filestat: move __init__ to frompath constructor
We're going to add a `fromfp` constructor soon, and this also allows a filestat
object for a non-existent file to be created.
2017-06-10 14:09:54 -07:00
FUJIWARA Katsunori
f010ae3777 util: make filestat.__eq__ return True if both of self and old have None stat
For convenience to compare two filestat objects regardless of
None-ness of stat field.
2017-06-09 13:07:48 +09:00
FUJIWARA Katsunori
49b9b7a367 util: make filestat.avoidambig() return whether ambiguity is avoided or not 2017-06-09 12:58:17 +09:00
Augie Fackler
9a722d4382 merge with stable 2017-06-03 16:33:28 -04:00
FUJIWARA Katsunori
80da112aee win32mbcs: avoid unintentional failure at colorization
Since 1d07d9da84a0, pycompat.bytestr() wrapped by win32mbcs returns
unicode object, if an argument is not byte-str object. And this causes
unexpected failure at colorization.

pycompat.bytestr() is used to convert from color effect "int" value to
byte-str object in color module. Wrapped pycompat.bytestr() returns
unicode object for such "int" value, because it isn't byte-str.

If this returned unicode object is used to colorize non-ASCII byte-str
in cases below, UnicodeDecodeError is raised at an operation between
them.

  - colorization uses "ansi" color mode, or

    Even though this isn't default on Windows, user might use this
    color mode for third party pager.

  - ui.write() is buffered with labeled=True

    Buffering causes "ansi" color mode internally, regardless of
    actual color mode. With "win32" color mode, extra escape sequences
    are omitted at writing data out.

    For example, with "win32" color mode, "hg status" doesn't fail for
    non-ASCII filenames, but "hg log" does for non-ASCII text, because
    the latter implies buffered formatter.

There are many "color effect" value lines in color.py, and making them
byte-str objects isn't suitable for fixing on stable. In addition to
it, pycompat.bytestr will be used to get byte-str object from any
types other than int, too.

To resolve this issue, this patch does:

  - replace pycompat.bytestr in checkwinfilename() with newly added
    hook point util._filenamebytestr, and

  - make win32mbcs reverse-wrap util._filenamebytestr
    (this is a replacement of 1d07d9da84a0)

This patch does two things above at same time, because separately
applying the former change adds broken revision (from point of view of
win32mbcs) to stable branch.

"_" prefix is added to "filenamebytestr", because it is win32mbcs
specific hook point.
2017-05-31 23:44:33 +09:00
Augie Fackler
6278186d0b util: pass sysstrs to warnings.filterwarnings
Un-breaks the Python 3 build.
2017-04-13 13:12:49 -04:00
Pierre-Yves David
a185960897 util: add a way to issue deprecation warning without a UI object
Our current deprecation warning mechanism relies on ui object. They are case
where we cannot have access to the UI object. On a general basis we avoid using
the python mechanism for deprecation warning because up to Python 2.6 it is
exposing warning to unsuspecting user who cannot do anything to deal with them.

So we build a "safe" strategy to hide this warnings behind a flag in an
environment variable. The test runner set this flag so that tests show these
warning.  This will help us marker API as deprecated for extensions to update
their code.
2017-04-04 11:03:29 +02:00
Gábor Stefanik
387861cc38 util: fix human-readable printing of negative byte counts
Apply the same human-readable printing rules to negative byte counts as to
positive ones. Fixes output of debugupgraderepo.
2017-04-10 18:16:30 +02:00
Gregory Szorc
3c5a0a039c util: make cookielib module available
In preparation for supporting sending cookies on HTTP requests.
2017-03-09 21:35:21 -08:00
Yuya Nishihara
af3a1d8aae sortdict: fix .pop() to return a value
My future patch will need it.
2017-04-09 11:57:09 +09:00
Pulkit Goyal
04352e4321 py3: replace str() with bytes() 2017-04-07 13:46:35 +05:30
Augie Fackler
f0863383db util: fix %-formatting on docstring by moving a closing parenthesis
We have to do the % formatting over the sysstr, since the things we're
going to splat into it are themselves sysstrs. This is probably
technically wrong-ish, since bt is probably actually a bytestr here,
but this fixes the immediate issue, which was that hg was broken on
Python 3.
2017-04-03 19:03:34 -04:00
Gregory Szorc
252949c70b util: document bundle compression
An upcoming patch will add support for documenting bundle
specifications in more detail. As part of this, we'd like to
enumerate available bundle compression formats. In order to do
this, we need to provide the help mechanism a dict of names
and objects with docstrings.

This patch adds docstrings to compengine.bundletype and adds
a function for retrieving a dict of them. The code is not yet
used.
2017-04-01 13:29:01 -07:00
Yuya Nishihara
40564e2c98 util: add helper to convert between LF and native EOL
See the next patch for why.
2017-03-29 21:40:15 +09:00
Yuya Nishihara
1163409660 util: extract pure tolf/tocrlf() functions from eol extension
This can be used for EOL conversion of text files.
2017-03-29 21:28:54 +09:00
Jun Wu
47e81ec208 hardlink: check directory's st_dev when copying files
Previously, when copying a file, copyfiles will compare src's st_dev with
dirname(dst)'s st_dev, to decide whether to enable hardlink or not.

That could have issues on Linux's overlayfs, where stating directories could
result in different st_dev from st_dev of stating files, even if both the
directories and the files exist in the overlay's upperdir.

This patch fixes it by checking dirname(src) instead. It's more consistent
because we are checking directories for both src and dest.

That fixes test-hardlinks.t running on common Docker setups.
2017-03-29 12:37:03 -07:00
Jun Wu
7621302bee hardlink: duplicate hardlink detection for copying files and directories
A later patch will change one of them so they diverge.
2017-03-29 12:26:46 -07:00
Jun Wu
29358be572 hardlink: extract topic text logic of copyfiles
The topic text shows whether it's "linking" or "copying", based on
"hardlink" value. The function is extracted so a later patch can reuse it.
2017-03-29 12:21:15 -07:00
Yuya Nishihara
91f8f53f3a statfs: make getfstype() raise OSError
It's better for getfstype() function to not suppress an error. Callers can
handle it as necessary. Now "hg debugfsinfo" will report OSError.
2017-03-25 17:25:23 +09:00
Denis Laxalde
a70f2fcec7 revset: factor out linerange processing into a utility function
Similar processing will be done in hgweb.webutil in forthcoming changeset.
2017-02-24 18:39:08 +01:00
Jun Wu
057a1065b9 util: enable hardlink for some BSD-family filesystems
Since we can now detect filesystems on FreeBSD and OSX. Add their major
filesystems (ufs, zfs for FreeBSD; hfs for OSX) to the hardlink whitelist.
2017-03-23 22:31:50 -07:00
Jun Wu
443e227575 util: use util.getfstype 2017-03-23 12:01:18 -07:00
Jun Wu
43dc9d6102 util: add a getfstype method
The util version is a thin wrapper of the osutil version, which is not
always available.
2017-03-23 11:58:45 -07:00
Jun Wu
2761a9ed0c util: enable hardlink for copyfile
This patch removes the global variable "allowhardlinks" that disables
hardlink in all cases, so hardlink gets enabled if the filesystem type is
whitelisted.

Third party extensions wanting to enable hardlink support unconditionally
can replace "_hardlinkfswhitelist.__contains__".
2017-03-12 01:03:23 -08:00
Jun Wu
d726efc89f util: disable hardlink for copyfile if fstype is outside a whitelist
Since osutil.getfstype is available, use it to detect filesystem types. The
whitelist currently includes common local filesystems on Linux where they
should have good hardlink support. We may add new filesystems for other
platforms later.
2017-03-12 00:23:07 -08:00
Gregory Szorc
56e2825efa py3: stop exporting urlparse from pycompat and util (API)
There are no consumers of this in tree.

Functions formerly available on this object/module can now be accessed
via {pycompat,util}.urlreq.
2017-03-21 22:47:49 -07:00
Gregory Szorc
28ebe71a15 util: use urlreq.unquote
pycompat.urlreq.unquote and pycompat.urlunquote effectively alias the
same thing. pycompat.urlunquote is only used once in the code base.
So let's switch to urlreq.unquote.

"Effectively" in the above paragraph is because pycompat.urlreq.unquote
aliases urllib.unquote and pycompat.urlunquote aliases urlparse.unquote
on Python 2. You might think one of urllib.unquote and urlparse.unquote
is an alias to the other, but you would be incorrect. In fact, these
functions are copies of each other. There is even a comment in the
CPython source code saying to keep them in sync. You can't make this
up.
2017-03-21 22:23:11 -07:00