Commit Graph

955 Commits

Author SHA1 Message Date
Yuya Nishihara
91f8f53f3a statfs: make getfstype() raise OSError
It's better for getfstype() function to not suppress an error. Callers can
handle it as necessary. Now "hg debugfsinfo" will report OSError.
2017-03-25 17:25:23 +09:00
Denis Laxalde
a70f2fcec7 revset: factor out linerange processing into a utility function
Similar processing will be done in hgweb.webutil in forthcoming changeset.
2017-02-24 18:39:08 +01:00
Jun Wu
057a1065b9 util: enable hardlink for some BSD-family filesystems
Since we can now detect filesystems on FreeBSD and OSX. Add their major
filesystems (ufs, zfs for FreeBSD; hfs for OSX) to the hardlink whitelist.
2017-03-23 22:31:50 -07:00
Jun Wu
443e227575 util: use util.getfstype 2017-03-23 12:01:18 -07:00
Jun Wu
43dc9d6102 util: add a getfstype method
The util version is a thin wrapper of the osutil version, which is not
always available.
2017-03-23 11:58:45 -07:00
Jun Wu
2761a9ed0c util: enable hardlink for copyfile
This patch removes the global variable "allowhardlinks" that disables
hardlink in all cases, so hardlink gets enabled if the filesystem type is
whitelisted.

Third party extensions wanting to enable hardlink support unconditionally
can replace "_hardlinkfswhitelist.__contains__".
2017-03-12 01:03:23 -08:00
Jun Wu
d726efc89f util: disable hardlink for copyfile if fstype is outside a whitelist
Since osutil.getfstype is available, use it to detect filesystem types. The
whitelist currently includes common local filesystems on Linux where they
should have good hardlink support. We may add new filesystems for other
platforms later.
2017-03-12 00:23:07 -08:00
Gregory Szorc
56e2825efa py3: stop exporting urlparse from pycompat and util (API)
There are no consumers of this in tree.

Functions formerly available on this object/module can now be accessed
via {pycompat,util}.urlreq.
2017-03-21 22:47:49 -07:00
Gregory Szorc
28ebe71a15 util: use urlreq.unquote
pycompat.urlreq.unquote and pycompat.urlunquote effectively alias the
same thing. pycompat.urlunquote is only used once in the code base.
So let's switch to urlreq.unquote.

"Effectively" in the above paragraph is because pycompat.urlreq.unquote
aliases urllib.unquote and pycompat.urlunquote aliases urlparse.unquote
on Python 2. You might think one of urllib.unquote and urlparse.unquote
is an alias to the other, but you would be incorrect. In fact, these
functions are copies of each other. There is even a comment in the
CPython source code saying to keep them in sync. You can't make this
up.
2017-03-21 22:23:11 -07:00
Ryan McElroy
ac0305dcbb util: use tryunlink in unlinkpath
We just introduced a func to attempt a file removal. Start using it.
2017-03-21 06:50:28 -07:00
Ryan McElroy
ff386f9a99 util: add tryunlink function
Throughout mercurial cdoe, there is a common pattern of attempting to remove
a file and ignoring ENOENT errors. Let's move this into a common function to
allow for cleaner code.
2017-03-21 06:50:28 -07:00
Ryan McElroy
4456f7562c util: unify unlinkpath
Previously, there were two slightly different versions of unlinkpath between
windows and posix, but these differences were eliminated in previous patches.
Now we can unify these two code paths inside of the util module.
2017-03-21 06:50:28 -07:00
Augie Fackler
0598bcdf11 util: reference __main__ in sys.modules as a sysstr 2017-03-19 01:19:27 -04:00
Augie Fackler
c476b55519 util: use bytes re on bytes input in fspath
Fixes `hg add` on Python 3.
2017-03-19 00:16:39 -04:00
Augie Fackler
585f46ae2f util: use pycompat.bytestr in checkwinfilename
Fixes `hg add` on python3.
2017-03-19 00:16:08 -04:00
Yuya Nishihara
687bfe4011 py3: call codecs.escape_decode() directly
The same rule as 9ecf9a0c9837 applies.
2017-03-17 23:48:22 +09:00
Yuya Nishihara
dc88179a4e util: wrap s.decode('string_escape') calls for future py3 compatibility 2017-03-17 23:42:46 +09:00
Pierre-Yves David
fc2b521909 util: explicitly tests for None
Changeset 3b9cdb72931f removed the mutable default value, but did not explicitly
tested for None. Such implicit checking can introduce semantic and performance
issue. We move to an explicit check for None as recommended by PEP8:

https://www.python.org/dev/peps/pep-0008/#programming-recommendations
2017-03-15 15:07:14 -07:00
Yuya Nishihara
4744dd2998 py3: call codecs.escape_encode() directly
string_escape doesn't exist on Python 3, but fortunately the undocumented
codecs.escape_encode() function exists on CPython 2.6, 2.7, 3.5 and PyPy 5.6.
So let's use it for now.

http://stackoverflow.com/a/23151714
2017-03-15 23:28:39 +09:00
Yuya Nishihara
02022fc3c5 util: wrap s.encode('string_escape') call for future py3 compatibility 2017-03-15 23:06:50 +09:00
Yuya Nishihara
99a868d61d py3: call strftime() with native str type
Since strftime() may contain non-ascii character if locale set, we use
strfrom/tolocal().

Now "hg tip" works.
2017-03-13 09:19:07 -07:00
Yuya Nishihara
af7f25fdb3 encoding: add converter between native str and byte string
This kind of encoding conversion is unavoidable on Python 3.
2017-03-13 09:12:56 -07:00
Yuya Nishihara
dcade16cf7 encoding: factor out unicode variants of from/tolocal()
Unfortunately, these functions will be commonly used on Python 3.
2017-03-13 09:11:08 -07:00
Rishabh Madan
8a9d68951e py3: use iter() instead of iterkeys() 2017-03-16 04:53:23 +05:30
Gregory Szorc
1c399d6b97 util: make strdate's defaults default value a dict
It was specified to be an empty list in 3d8abfdaa08a in 2007.
It was correct at the time. But when the function was
refactored in 7bca0f2718ab (2010), it started expecting a dict.
I guess this code path is untested?

Thanks to Yuya for spotting this.
2017-03-14 08:51:35 -07:00
Gregory Szorc
fbefc30c75 util: don't use mutable default argument value
I don't think this is any tight loops and we'd need to worry about
PyObject creation overhead. Also, I'm pretty sure strptime()
will be much slower than PyObject creation (date parsing is
surprisingly slow).
2017-03-12 21:54:32 -07:00
Augie Fackler
d5d09dfdf4 util: teach url object about __bytes__
__str__ tries to do something reasonable, but someone else more
familiar with encoding bugs should check my work.
2017-03-12 03:33:22 -04:00
Pulkit Goyal
7deacd3d03 util: pass encoding.[encoding|encodingmode] as unicodes
We need to pass str to encode() and decode().
2017-03-12 07:35:13 +05:30
Mads Kiilerich
6cbae57046 util: add debugstacktrace depth limit
Useful when you don't care about the start of the stack, but only want to see
the last entries.
2015-01-14 01:15:26 +01:00
Mads Kiilerich
97ac40d793 util: strip trailing newline from debugstacktrace message
This makes the function more convenient to use as drop-in replacement for
ui.write & co.
2015-01-16 04:26:40 +01:00
Durham Goode
56acb0faaa util: add allowhardlinks module variable
To enable extensions to enable hardlinks for certain environments, let's move
the 'if False' to be an 'if allowhardlinks' and let extensions modify the
allowhardlinks variable.

Tests on linux ext4 pass with it set to True and to False.
2017-03-02 10:12:40 -08:00
Yuya Nishihara
8072a1daee chg: deduplicate error handling of ui.system()
This moves 'onerr' handling from low-level util.system() to higher level,
which seems better API separation.
2017-02-19 01:16:45 +09:00
Pulkit Goyal
22dbb9466c py3: use pycompat.fsencode() to convert __file__ to bytes
__file__ returns unicodes on Python 3. This patch uses pycompat.fsencode() to
convert them to bytes.
2017-02-20 18:40:42 +05:30
Pulkit Goyal
7f47e65fef py3: convert the mode argument of os.fdopen to unicodes
Couple of these from the earlier series got lost while rebasing. So this patch
converts them again.
2017-02-16 17:30:35 +05:30
Simon Farnsworth
e0b70e4f7f mercurial: switch to util.timer for all interval timings
util.timer is now the best available interval timer, at the expense of not
having a known epoch. Let's use it whenever the epoch is irrelevant.
2017-02-15 13:17:39 -08:00
Simon Farnsworth
9415b63fba util: introduce timer()
As documented for timeit.default_timer, there are better timers available for
performance measures on some platforms. These timers don't have a set epoch,
and thus are only useful for interval measurements, but have higher
resolution, and thus get you a better measurement overall.

Use the same selection logic as Python's timeit.default_timer. This is a
platform clock on Python 2 and early Python 3, and time.perf_counter on Python
3.3 and later (where time.perf_counter is introduced as the best timer to use).
2017-02-15 11:53:59 -08:00
Pulkit Goyal
56031921a5 py3: convert the mode argument of os.fdopen to unicodes (2 of 2) 2017-02-13 22:15:28 +05:30
Simon Farnsworth
6a6899c4d4 util: always force line buffered stdout when stdout is a tty (BC)
pager replaced stdout with a line buffered version to work around glibc
deciding on a buffering strategy on the first write to stdout. This is going
to make my next patch hard, as replacing stdout will make tracking time
spent blocked on it more challenging.

Move the line buffering requirement to util.py, and remove it from pager.
This means that the abuse of ui.formatted=True and pager set to cat or equivalent
no longer results in a line-buffered output to a pipe, hence (BC), although
I don't expect anyone to be affected
2017-02-03 15:10:27 -08:00
Martin von Zweigbergk
06f115a93e util: make sortdict.keys() return a copy
dict.keys() is documented to return a copy, so it's surprising that
sortdict.keys() did not. I noticed this because we have an extension
that calls readlocaltags(). That method tries to remove any tags that
point to non-existent revisions (most likely stripped). However, since
it's unintentionally working on the instance it's modifying, it
sometimes fails to remove tags when there are multiple bad tags in a
row. This was not caught because localrepo.tags() does an additional
layer of filtering.

sortdict is also used in other places, but I have not checked whether
its keys() and/or __delitem__() methods are used there.
2017-01-30 22:58:56 -08:00
Pulkit Goyal
5a0e39fb56 util: add length argument to util.buffer()
util.buffer() either returns inbuilt buffer function or defines a new one which
slices. The inbuilt buffer() also has a length argument which is missing from
the ones we defined. This patch adds that length argument.
2017-01-14 20:05:15 +05:30
Gregory Szorc
4a3b8df214 util: compression APIs to support revlog decompression
Previously, compression engines had APIs for performing revlog
compression but no mechanism to perform revlog decompression. This
patch changes that.

Revlog decompression is slightly more complicated than compression
because in the compression case there is (currently) only a single
engine that can be used at a time. However for decompression, a
revlog could contain chunks from multiple compression engines. This
means decompression needs to map to multiple engines and
decompressors. This functionality is outside the scope of this patch.
But it drives the decision for engines to declare a byte header
sequence that identifies revlog data as belonging to an engine and
an API for obtaining an engine from a revlog header.
2017-01-02 13:27:20 -08:00
Gregory Szorc
29c30e4b7e util: compression APIs to support revlog compression
As part of "zstd all of the things," we need to teach revlogs to
use non-zlib compression formats. Because we're routing all compression
via the "compression manager" and "compression engine" APIs, we need to
introduction functionality there for performing revlog operations.

Ideally, revlog compression and decompression operations would be
implemented in terms of simple "compress" and "decompress" primitives.
However, there are a few considerations that make us want to have a
specialized primitive for handling revlogs:

1) Performance. Revlogs tend to do compression and especially
   decompression operations in batches. Any overhead for e.g.
   instantiating a "context" for performing an operation can be
   noticed. For this reason, our "revlog compressor" primitive is
   reusable. For zstd, we reuse the same compression "context" for
   multiple operations. I've measured this to have a performance
   impact versus constructing new contexts for each operation.

2) Specialization. By having a primitive dedicated to revlog use,
   we can make revlog-specific choices and leave the door open for
   more functionality in the future. For example, the zstd revlog
   compressor may one day make use of dictionary compression.

A future patch will introduce a decompress() on the compressor
object.

The code for the zlib compressor is basically copied from
revlog.compress(). Although it doesn't handle the empty input
case, the null first byte case, and the 'u' prefix case. These
cases will continue to be handled in revlog.py once that code is
ported to use this API.
2017-01-02 12:39:03 -08:00
Matt Harbison
86e0681833 util: teach stringmatcher to handle forced case insensitive matches
The 'author' and 'desc' revsets are documented to be case insensitive.
Unfortunately, this was implemented in 'author' by forcing the input to
lowercase, including for regex like '\B'.  (This actually inverts the meaning of
the sequence.)  For backward compatibility, we will keep that a case insensitive
regex, but by using matcher options instead of brute force.

This doesn't preclude future hypothetical 'icase-literal:' style prefixes that
can be provided by the user.  Such user specified cases can probably be handled
up front by stripping 'icase-', setting the variable, and letting it drop
through the existing code.
2017-01-11 21:47:19 -05:00
Gregory Szorc
73aa240fe1 util: declare wire protocol support of compression engines
This patch implements a new compression engine API allowing
compression engines to declare support for the wire protocol.

Support is declared by returning a compression format string
identifier that will be added to payloads to signal the compression
type of data that follows and default integer priorities of the
engine.

Accessor methods have been added to the compression engine manager
class to facilitate use.

Note that the "none" and "bz2" engines declare wire protocol support
but aren't enabled by default due to their priorities being 0. It
is essentially free from a coding perspective to support these
compression formats, so we do it in case anyone may derive use from
it.
2016-12-24 13:51:12 -07:00
Remi Chaintron
dfc79cbfc3 revlog: flag processor
Add the ability for revlog objects to process revision flags and apply
registered transforms on read/write operations.

This patch introduces:
- the 'revlog._processflags()' method that looks at revision flags and applies
  flag processors registered on them. Due to the need to handle non-commutative
  operations, flag transforms are applied in stable order but the order in which
  the transforms are applied is reversed between read and write operations.
- the 'addflagprocessor()' method allowing to register processors on flags.
  Flag processors are defined as a 3-tuple of (read, write, raw) functions to be
  applied depending on the operation being performed.
- an update on 'revlog.addrevision()' behavior. The current flagprocessor design
  relies on extensions to wrap around 'addrevision()' to set flags on revision
  data, and on the flagprocessor to perform the actual transformation of its
  contents. In the lfs case, this means we need to process flags before we meet
  the 2GB size check, leading to performing some operations before it happens:
  - if flags are set on the revision data, we assume some extensions might be
    modifying the contents using the flag processor next, and we compute the
    node for the original revision data (still allowing extension to override
    the node by wrapping around 'addrevision()').
  - we then invoke the flag processor to apply registered transforms (in lfs's
    case, drastically reducing the size of large blobs).
  - finally, we proceed with the 2GB size check.

Note: In the case a cachedelta is passed to 'addrevision()' and we detect the
flag processor modified the revision data, we chose to trust the flag processor
and drop the cachedelta.
2017-01-10 16:15:21 +00:00
Jun Wu
c50e85b0d3 util: extract the logic calculating environment variables
The method will be reused in chgserver. Move it out so it can be reused.
2017-01-10 06:58:02 +08:00
Pulkit Goyal
007bf9e678 py3: replace sys.executable with pycompat.sysexecutable
sys.executable returns unicodes on Python 3. This patch replaces occurences of
sys.executable with pycompat.sysexecutable.
2016-12-20 00:20:07 +05:30
Pulkit Goyal
d35cffe17d py3: replace sys.platform with pycompat.sysplatform (part 2 of 2) 2016-12-19 02:26:41 +05:30
Pulkit Goyal
4780c32e4c py3: replace os.name with pycompat.osname (part 1 of 2)
os.name returns unicodes on py3 and we have pycompat.osname which returns
bytes. This series of 2 patches will change every ocurrence of os.name with
pycompat.osname.
2016-12-19 00:16:52 +05:30
Pulkit Goyal
24a9271007 py3: replace os.environ with encoding.environ (part 4 of 5) 2016-12-18 02:06:00 +05:30