982 Commits

Author SHA1 Message Date
Bryan O'Sullivan
506a2494ba util: rename ctxmanager's __call__ method to enter 2016-01-14 09:31:01 -08:00
Bryan O'Sullivan
0393e42ac3 util: simplify file I/O functions using context managers 2016-01-12 14:49:35 -08:00
Bryan O'Sullivan
a6e4850fe2 util: replace file I/O with readfile 2016-01-12 16:16:19 -08:00
Matt Harbison
4fc4388f40 util: adjust hgcmd() to handle frozen Mercurial on OS X
Previously, 'hg serve -d' was trying to exec the bundled python executable,
which failed with:

    Unknown option: --
    usage: python [option] ...
    Try 'python -h'...
    abort: child process failed to start

See the previous patch for details about the content of the various command
variables.  Note that, unlike the previous patch, where an application bundling
Mercurial could set $HG in the environment to get the correct result, there
isn't anything that a bundling application could do to get the correct result
here.

'hg serve -d' now launches under TortoiseHg, and there is a process listed in
the background, but a client process cannot connect to it for some reason, so
more investigation is needed.
2016-01-10 18:15:39 -05:00
Matt Harbison
c25bbe3998 util: adjust hgexecutable() to handle frozen Mercurial on OS X
sys.executable is "$appbundle/Contents/MacOS/python" when Mercurial is bundled
in a frozen app bundle on OS X, so that isn't appropriate.  It appears that this
was only visible for things launched via util.system(), like external hooks,
where $HG was set wrong.

It appears that Mercurial also uses 'sys.modules['__main__'].__file__' (here)
and 'sys.argv[0]' (in platform.gethgcmd()) to figure out the command to spawn.
In both cases, this points to "$appbundle/Contents/Resources/hg", which invokes
the system python since "/usr/bin/env python" is on the shebang line.  On my
system with a screwed up python install, I get an error importing the os module
if this script is invoked.

We could take the dirname of sys.executable and join 'hg' instead of this if we
want to be paranoid, but the py2app bootstrap has been setting the environment
variable since 0.1.6 (current version is 0.9), so it seems safe and we might as
well use it.
2016-01-10 17:56:08 -05:00
Matt Harbison
38b5ec9395 util: adjust 'datapath' to be correct in a frozen OS X package
Apparently unlike py2exe, py2app copies the Mercurial source tree as-is to a
Contents/Resources subdirectory of an app bundle, and places its binary stub in
Contents/MacOS.  (The Windows install has the 'hgext' and 'mercurial' modules in
'lib/library.zip', while the help and templates subdirectories have been moved
out of the mercurial directory to the root of the installation.  I assume that
the python code living in a zip file is why "py2exe doesn't support __file__".)
Therefore, prior to this change, Mercurial in a frozen app bundle on OS X would
go looking for help *.txt, templates and locale info in Contents/MacOS, where
they don't exist.

There are only a handful of places that test for frozen, and not all of them are
wrong for OS X, so it seems wiser to handle them on a case by case basis, rather
than try to change mainfrozen().  The remaining cases are:

  1) util.hgexecutable() wrongly points to the bundled python executable, and
       affects $HG in util.system() launched processes (e.g. external hooks)
  2) util.hgcmd() wrongly points to the bundled python executable, but it seems
       to only affect 'hg serve -d'
  3) hook._pythonhook() may be OK, since I didn't see anything outrageous when
       printing sys.path from an internal hook.  I'm not sure if this special
       case is needed on OS X though.
  4) sslutil._plainapplepython() is OK, because sys.executable is not
       /usr/bin/python, nor is it in /System/Library/Frameworks
2016-01-10 17:49:01 -05:00
Augie Fackler
68c57ea46e util: don't capture exception with a name since we don't use it
Spotted by pyflakes.
2016-01-13 14:41:10 -05:00
Bryan O'Sullivan
e1c353db12 util: introduce ctxmanager, to avoid nested try/finally blocks
This is similar in spirit to contextlib.nested in Python <= 2.6,
but uses an extra level of indirection to avoid its inability to
clean up if an __enter__ method raises an exception.

Why add this mechanism?  It greatly simplifies scoped resource
management, and lets us eliminate several hundred lines of try/finally
blocks.  In many of these cases the "finally" is separated from the
"try" by hundreds of lines of code, which makes the connection
between resource acquisition and disposal difficult to follow.

(The preferred mechanism would be the "multi-with" syntax of 2.7+,
but Mercurial can't move to 2.7 for a while.)

Intended use:

>>> with ctxmanager(lambda: file('foo'), lambda: file('bar')) as c:
>>>    f1, f2 = c()

This will open both foo and bar when c() is invoked, and will close
both upon exit from the block.  If the attempt to open bar raises
an exception, the block will not be entered - but foo will still
be closed.
2016-01-11 15:25:43 -08:00
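The cleanup guarantee described for ctxmanager in e1c353db12 (if a later resource fails to open, the earlier ones are still closed) can be illustrated with a small sketch. This is not Mercurial's ctxmanager: it swaps in the standard library's `contextlib.ExitStack` from Python 3, and the `Resource` class, `enter_all` helper, and `demo` function are hypothetical names made up for the example.

```python
import contextlib


class Resource:
    """Toy context manager that records enter/exit events (hypothetical)."""

    def __init__(self, name, log, fail=False):
        self.name = name
        self.log = log
        self.fail = fail

    def __enter__(self):
        if self.fail:
            raise RuntimeError("cannot open %s" % self.name)
        self.log.append("enter " + self.name)
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append("exit " + self.name)
        return False  # never swallow exceptions


def enter_all(stack, *factories):
    """Call each factory in order and enter its context manager on the stack."""
    return [stack.enter_context(factory()) for factory in factories]


def demo(fail_second):
    """Return the event log for opening 'foo' and 'bar' together."""
    log = []
    try:
        with contextlib.ExitStack() as stack:
            foo, bar = enter_all(
                stack,
                lambda: Resource("foo", log),
                lambda: Resource("bar", log, fail=fail_second),
            )
            log.append("body")
    except RuntimeError:
        log.append("error")
    return log
```

When `fail_second` is true, the body is never reached, but "foo" is still exited before the error propagates, which is the behaviour the commit message asks for.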
Gregory Szorc
0708caa27a util: remove outdated comment about construction overhead
An old implementation of this class (possibly only in my local repo)
allocated nodes in the cache during construction time, making
__init__ slow for large cache capacities. The current implementation
lazily grows the cache size, making this comment wrong.
2016-01-05 20:52:34 -08:00
Eric Sumner
c24dabdde6 lrucachedict: add copy method
This diff implements the standard dict copy() method for lrucachedicts, which
will be used in the pushrebase extension to make a copy of the manifestcache.
2015-12-30 13:10:53 -08:00
Matt Mackall
e9718e4615 merge with stable 2015-12-16 17:40:01 -06:00
Siddharth Agarwal
5544ae6c91 copyfile: add an optional parameter to copy other stat data
Contrary to the comment, I didn't see any evidence that we were copying
atime/mtime at all. This adds a parameter to copyfile to optionally copy it and
other stat data, with the default being to not copy it.

Many systems don't support changing the timestamp of a symlink, but we don't
need that in general anyway -- copystat is mostly useful for editors, most of
which will dereference symlinks anyway.
2015-12-12 11:00:04 -08:00
Gregory Szorc
f72b4b4450 util: reimplement lrucachedict
As part of attempting to more aggressively use the existing
lrucachedict, collections.deque operations were frequently
showing up in profiling output, negating benefits of caching.

Searching the internet seems to tell me that the most efficient
way to implement an LRU cache in Python is to have a dict indexing
the cached entries and then to use a doubly linked list to track
freshness of each entry. So, this patch replaces our existing
lrucachedict with a version using such a pattern.

The recently introduced perflrucachedict command reveals the
following timings for 10,000 operations for the following cache
sizes for the existing cache:

n=4     init=0.004079 gets=0.003632 sets=0.005188 mixed=0.005402
n=8     init=0.004045 gets=0.003998 sets=0.005064 mixed=0.005328
n=16    init=0.004011 gets=0.004496 sets=0.005021 mixed=0.005555
n=32    init=0.004064 gets=0.005611 sets=0.005188 mixed=0.006189
n=64    init=0.003975 gets=0.007684 sets=0.005178 mixed=0.007245
n=128   init=0.004121 gets=0.012005 sets=0.005422 mixed=0.009471
n=256   init=0.004143 gets=0.020295 sets=0.005227 mixed=0.013612
n=512   init=0.004039 gets=0.036703 sets=0.005243 mixed=0.020685
n=1024  init=0.004193 gets=0.068142 sets=0.005251 mixed=0.033064
n=2048  init=0.004070 gets=0.133383 sets=0.005160 mixed=0.050359
n=4096  init=0.004053 gets=0.265194 sets=0.004868 mixed=0.048352
n=8192  init=0.004087 gets=0.542218 sets=0.004562 mixed=0.032753
n=16384 init=0.004106 gets=1.064055 sets=0.004179 mixed=0.020367
n=32768 init=0.004034 gets=2.097620 sets=0.004260 mixed=0.013031
n=65536 init=0.004108 gets=4.106390 sets=0.004268 mixed=0.010191

As the data shows, the existing cache's retrieval performance
diminishes linearly with cache size. (Keep in mind the microbenchmark
is testing 100% cache hit rate.)

The new cache implementation reveals the following:

n=4     init=0.006665 gets=0.006541 sets=0.005733 mixed=0.006876
n=8     init=0.006649 gets=0.006374 sets=0.005663 mixed=0.006899
n=16    init=0.006570 gets=0.006504 sets=0.005799 mixed=0.007057
n=32    init=0.006854 gets=0.006459 sets=0.005747 mixed=0.007034
n=64    init=0.006580 gets=0.006495 sets=0.005740 mixed=0.006992
n=128   init=0.006534 gets=0.006739 sets=0.005648 mixed=0.007124
n=256   init=0.006669 gets=0.006773 sets=0.005824 mixed=0.007151
n=512   init=0.006701 gets=0.007061 sets=0.006042 mixed=0.007372
n=1024  init=0.006641 gets=0.007620 sets=0.006387 mixed=0.007464
n=2048  init=0.006517 gets=0.008598 sets=0.006871 mixed=0.008077
n=4096  init=0.006720 gets=0.010933 sets=0.007854 mixed=0.008663
n=8192  init=0.007383 gets=0.015969 sets=0.010288 mixed=0.008896
n=16384 init=0.006660 gets=0.025447 sets=0.011208 mixed=0.008826
n=32768 init=0.006658 gets=0.044390 sets=0.011192 mixed=0.008943
n=65536 init=0.006836 gets=0.082736 sets=0.011151 mixed=0.008826

Let's go through the results.

The new cache takes longer to construct. ~6.6ms vs ~4.1ms. However,
this is measuring 10,000 __init__ calls, so the difference is
~0.2us/instance. We currently only create lrucachedict for manifest
instances, so this regression is not likely relevant.

The new cache is slightly slower for retrievals for cache sizes
< 1024. It's worth noting that the only existing use of lrucachedict
is in manifest.py and the default cache size is 4. This regression
is worrisome. However, for n=4, the delta is ~2.9ms for 10,000 lookups,
or ~0.29us/op. Again, this is a marginal regression and likely not
relevant in the real world. Timing `hg log -p -l 100` for
mozilla-central reveals that cache lookup times are dominated by
decompression and fulltext resolution (even with lz4 manifests).

The new cache is significantly faster for retrievals at larger
capacities. Whereas the old implementation has retrieval performance
linear with cache capacity, the new cache is constant time until much
larger values. And, when it does start to increase significantly, it
is a few magnitudes faster than the current cache.

The new cache does appear to be slower for sets when capacity is large.
However, performance is similar for smaller capacities. Of course,
caches should generally be optimized for retrieval performance because
if a cache is getting more sets than gets, it doesn't really make
sense to cache. If this regression is worrisome, again, taking the
largest regression at n=65536 of ~6.9ms for 10,000 results in a
regression of ~0.68us/op. This is not significant in the grand scheme
of things.

Overall, the new cache is performant at retrievals at much larger
capacity values which makes it a generally more useful cache backend.
While there are regressions, their absolute value is extremely small.
Since we aren't using lrucachedict aggressively today, these
regressions should not be relevant. The improved scalability of
lrucachedict should enable us to more aggressively utilize
lrucachedict for more granular caching (read: higher capacity caches)
in the near future. The impetus for this patch is to establish a cache
of decompressed revlog revisions, notably manifest revisions. And since
delta chains can grow to >10,000 and cache hit rate can be high, the
improved retrieval performance of lrucachedict should be relevant.
2015-12-06 19:04:10 -08:00
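The pattern described in f72b4b4450 (a dict indexing the entries plus a doubly linked list tracking freshness) can be sketched roughly as follows. This is a minimal illustration and not Mercurial's lrucachedict: the `LRUCache` name, the sentinel-node layout, and the Python 3 syntax are assumptions made for the example, and a capacity of at least 1 is assumed.

```python
class _Node:
    """One cache entry in the circular doubly linked list."""

    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class LRUCache:
    """Hypothetical LRU cache: dict for lookup, linked list for recency."""

    def __init__(self, capacity):
        self._capacity = capacity  # assumed to be >= 1
        self._map = {}
        # Sentinel node: head.next is the newest entry, head.prev the oldest.
        self._head = _Node(None, None)
        self._head.prev = self._head.next = self._head

    def _unlink(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node):
        node.next = self._head.next
        node.prev = self._head
        self._head.next.prev = node
        self._head.next = node

    def __getitem__(self, key):
        node = self._map[key]  # raises KeyError on a miss
        # Moving a node to the front is O(1), independent of cache size.
        self._unlink(node)
        self._push_front(node)
        return node.value

    def __setitem__(self, key, value):
        node = self._map.get(key)
        if node is not None:
            node.value = value
            self._unlink(node)
        else:
            if len(self._map) >= self._capacity:
                oldest = self._head.prev
                self._unlink(oldest)
                del self._map[oldest.key]
            node = _Node(key, value)
            self._map[key] = node
        self._push_front(node)

    def __contains__(self, key):
        return key in self._map

    def __len__(self):
        return len(self._map)
```

Because a hit only relinks one node, retrieval cost in this sketch does not grow with the number of entries, in contrast to the linear growth the old timings show.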
Yuya Nishihara
b65525f9c0 util: rename argument of isatty()
In general, "fd" is a file descriptor, but isatty() expects a file object.
We should call it "fp" or "fh".
2015-12-13 18:48:35 +09:00
Gregory Szorc
7b3fa04da1 util: use absolute_import 2015-12-12 23:14:08 -08:00
Gregory Szorc
c63ae89667 util: make hashlib import unconditional
hashlib was added in Python 2.5. As far as I can tell, SHA-512 is always
available in 2.6+. So move the hashlib import to the top of the file and
remove the one-off handling of SHA-512.
2015-12-12 23:30:37 -05:00
Gregory Szorc
11a7cac430 util: add versiontuple() for returning parsed version information
We have similar code in dispatch.py. Another consumer is about to be
created, so establish a generic function in an accessible location.
2015-11-24 14:23:51 -08:00
Gregory Szorc
5f5da12c81 util.datestr: use divmod()
We were computing the quotient and remainder of a division operation
separately. The built-in divmod() function allows us to do this with
a single function call. Do that.
2015-11-14 17:30:10 -08:00
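As a small illustration of the divmod() idea in 5f5da12c81, the sketch below formats a UTC offset given in seconds. The `format_offset` helper and its sign convention are hypothetical and are not taken from util.datestr.

```python
def format_offset(offset_seconds):
    """Format an offset in seconds as +HHMM or -HHMM (hypothetical helper)."""
    sign = "-" if offset_seconds < 0 else "+"
    total_minutes = abs(offset_seconds) // 60
    # divmod returns the quotient and the remainder in a single call.
    hours, minutes = divmod(total_minutes, 60)
    return "%s%02d%02d" % (sign, hours, minutes)
```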
Matt Mackall
ab46ec3b99 util: drop statmtimesec
We've globally forced stat to return integer times which agrees with
our extension code, so this is no longer needed.

This speeds up status on mozilla-central substantially:

$ hg perfstatus
! wall 0.190179 comb 0.180000 user 0.120000 sys 0.060000 (best of 53)
$ hg perfstatus
! wall 0.275729 comb 0.270000 user 0.210000 sys 0.060000 (best of 36)
2015-11-19 13:15:17 -06:00
Matt Mackall
62d0fc250b util: disable floating point stat times (issue4836)
Alternate fix for this issue which avoids putting extra function calls
and exception handling in the fast path.

For almost all purposes, integer timestamps are preferable to
Mercurial. It stores integer timestamps in the dirstate and would thus
like to avoid doing any float/int comparisons or conversions. We will
continue to have to deal with 1-second granularity on filesystems for
quite some time, so this won't significantly hinder our capabilities.

This has some impact on our file cache validation code in that it
lowers timestamp resolution. But as we still have to deal with
low-resolution filesystems, we're not relying on this anyway.

An alternate approach is to use stat[ST_MTIME], which is guaranteed to
be an integer. But since this support isn't already in our extension,
we can't depend on it being available without adding a hard Python->C
API dependency that's painful for people like yours truly who have to
bisect regularly and people without compilers.
2015-11-19 13:21:24 -06:00
Sean Farley
f6841c5b51 util: also catch IndexError
This makes life so, so much easier for hgwatchman, which provides a named tuple
but throws an IndexError instead of a TypeError.
2015-10-13 16:05:30 -07:00
Pierre-Yves David
30913031d4 error: get Abort from 'error' instead of 'util'
The home of 'Abort' is 'error', not 'util'; however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.

For great justice.
2015-10-08 12:55:45 -07:00
Yuya Nishihara
5715ebc02c util: use tuple accessor to get accurate st_mtime value (issue4836)
Because st.st_mtime is computed as 'sec + 1e-9 * nsec' and double is too narrow
to represent nanoseconds, int(st.st_mtime) can be 'sec + 1'. Therefore, that
value could be different from the one returned by osutils.listdir().

This patch fixes the problem by accessing the raw st_mtime by tuple index.

It catches TypeError to fall back to st.st_mtime because our osutil.stat does
not support tuple index. In dirstate.normal(), 'st' is always a Python stat,
but in dirstate.status(), it can be either a Python stat or an osutil.stat.

Thanks to vgatien-baron@janestreet.com for finding the root cause of this
subtle problem.
2015-10-04 22:35:36 +09:00
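The rounding problem described in 5715ebc02c can be reproduced with plain arithmetic. The sketch below is an illustration only: `float_mtime_seconds` and `mtime_seconds` are hypothetical helper names, and the code uses the standard library's os.stat rather than Mercurial's osutil.

```python
import os
import stat


def float_mtime_seconds(sec, nsec):
    """Mimic int(st.st_mtime) when st_mtime is computed as sec + 1e-9 * nsec."""
    # A double cannot hold nanosecond precision at current epoch values,
    # so a large nsec can round the sum up to the next whole second.
    return int(sec + 1e-9 * nsec)


def mtime_seconds(path):
    """Read the whole-second mtime through the stat tuple index."""
    return os.stat(path)[stat.ST_MTIME]
```

With sec=1443965736 and nsec=999999999, the float path yields 1443965737, one second later than the true whole-second value, while the tuple accessor always returns an integer number of seconds.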
Yuya Nishihara
ea5724ad42 util: extract stub function to get mtime with second accuracy
This function is trivial but will need a long comment why it can't use
st.st_mtime. See the next patch for details.
2015-10-04 22:25:29 +09:00
Matt Harbison
bb1dafe069 util: extract stringmatcher() from revset
This is used to match against tags, bookmarks, etc in revsets.  It will be used
in a future patch to do the same tag matching in templater.
2015-08-22 22:52:18 -04:00
Gregory Szorc
fd27bc1367 util.chunkbuffer: avoid extra mutations when reading partial chunks
Previously, a read(N) where N was less than the length of the first
available chunk would mutate the deque instance twice and allocate a new
str from the slice of the existing chunk. Profiling drew my attention
to these as a potential hot spot during changegroup reading.

This patch makes the code more complicated in order to avoid the
aforementioned 3 operations.

On a pre-generated mozilla-central gzip bundle, this series has the
following impact on `hg unbundle` performance on my MacBook Pro:

before: 358.21 real       317.69 user        38.49 sys
after:  301.57 real       262.69 user        37.11 sys
delta:  -56.64 real       -55.00 user        -1.38 sys
2015-10-05 17:36:32 -07:00
Gregory Szorc
c98628be54 util.chunkbuffer: refactor chunk handling logic
This will make the next patch easier to read. It provides no benefit on
its own.
2015-10-05 16:34:47 -07:00
Gregory Szorc
fdf56bc121 util.chunkbuffer: special case reading everything
The new code results in simpler logic within the while loop. It is also
faster since we avoid performing operations on the queue and buf
collections. However, there shouldn't be any super hot loops for this
since the whole point of chunkbuffer is to avoid reading large amounts
of data at once. This does, however, make it easier to optimize
chunkbuffer in a subsequent patch.
2015-10-05 16:28:12 -07:00
Yuya Nishihara
697917e9d9 util.system: compare fileno to see if it needs stdout redirection
Future patches will reopen stdout to be line-buffered, so sys.stdout may
be a different object than sys.__stdout__.
2015-10-03 14:57:24 +09:00
Pierre-Yves David
496ebe3ecb changegroup: use a different compression key for BZ in HG10
For "space saving", bundle1 strips the first two bytes of the BZ stream since
they are always 'BZ'. So the current code bootstraps the decompressor with 'BZ'.
This hack is impractical in the more generic case, so we move it into a
dedicated "decompression".
2015-09-23 11:33:30 -07:00
Siddharth Agarwal
209bbcf824 util: avoid mutable default arguments
I almost introduced a bug around this code by accidentally mutating a default
argument. There's no reason for these to exist.
2015-09-22 16:55:18 -07:00
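The hazard behind 209bbcf824 is that a default argument is evaluated once, when the function is defined, so a mutable default is shared across calls. The two functions below are hypothetical examples and are not code from util.py.

```python
def append_shared(item, bucket=[]):
    """Buggy: the same default list object is reused on every call."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Safer: use None as the default and create a new list per call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_shared` twice without a bucket returns the same list both times, with both items in it, whereas `append_fresh` returns an independent list on each call.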
Pierre-Yves David
47af8d73d3 compression: use 'None' for no-compression
This seems more idiomatic and clearer. We still support both None and 'UN' for
now because no users have been migrated yet.
2015-09-15 17:53:28 -07:00
Pierre-Yves David
f052875fa0 changegroup: move all compressions utilities in util
We'll reuse the compression for other things (next target bundle2), so let's
make it more accessible and organised.
2015-09-15 17:35:32 -07:00
timeless@mozdev.org
5569d89fd9 util: capitalize Python in MBTextWrapper._wrap_chunks comment 2015-09-08 15:32:20 -04:00
Yuya Nishihara
7905f93556 util: extract function that parses timezone string
It will be used to parse a timezone argument passed to a template function.
2015-09-01 19:43:14 +09:00
timeless@mozdev.org
52eae47139 spelling: behaviour -> behavior 2015-08-28 10:53:55 -04:00
Siddharth Agarwal
532b96181d util: drop Python 2.4 compat by directly importing md5 and sha1
There's been a fair amount of cruft here over the years, which we can all
just get rid of now.
2015-10-24 15:56:16 -07:00
Pierre-Yves David
2c879590b7 bufferedinputpipe: remove N^2 computation of buffer length (issue4735)
The assumption that dynamically computing the length of the buffer was N^2 but
negligible because it is fast turned out to be false. So we drop the dynamic
computation and manually keep track of the buffer length.
2015-06-26 11:29:50 -07:00
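The fix described in 2c879590b7 amounts to maintaining a running length instead of recomputing it from the buffered chunks. A minimal sketch of that idea follows; `TrackedBuffer` and its methods are hypothetical names and do not reflect the actual bufferedinputpipe interface.

```python
class TrackedBuffer:
    """Chunk buffer that keeps its total length up to date (hypothetical)."""

    def __init__(self):
        self._chunks = []
        self._length = 0

    def append(self, data):
        self._chunks.append(data)
        # Update the counter here instead of summing chunk sizes on demand.
        self._length += len(data)

    def __len__(self):
        # O(1), rather than sum(len(c) for c in self._chunks) on every call.
        return self._length

    def take_all(self):
        """Return all buffered bytes and reset the buffer."""
        data = b"".join(self._chunks)
        self._chunks = []
        self._length = 0
        return data
```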
Pierre-Yves David
4d8c7e6435 bufferedinputpipe: remove an outdated comment
This comment is the remains of an intermediate implementation using

  self._buffer += data

This implementation never made it to the repository and we can safely drop the
comment.
2015-06-27 11:51:25 -07:00
Gregory Szorc
5380dea2a7 global: mass rewrite to use modern exception syntax
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".

This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.

This patch was produced by running `2to3 -f except -w -n .`.
2015-06-23 22:20:08 -07:00
Pierre-Yves David
6fec33b555 util: add a simple poll utility
We'll use it to detect when an sshpeer has server output to be displayed.

The implementation is very basic because supporting all cases is not the focus
of this series.
2015-05-20 18:00:05 -05:00
Pierre-Yves David
1447d73588 util: introduce a bufferedinputpipe utility
To restore real-time server output through ssh, we need to use a polling
feature (like select) on the pipes used to communicate with the ssh client.
However, we cannot use select alongside Python-level buffering of these pipes
(because we need to know if the buffer is non-empty before calling select).

Unbuffered performance, however, is terrible, presumably because the 'readline'
call issues 'read(1)' calls until it finds a '\n'. To work around that, we
introduce our own overlay that does the buffering by hand, exposing the state
of the buffer to the outside world.

The usage of polling IO will be introduced later in the 'sshpeer' module. All
its logic will be very specific to the way Mercurial communicates over ssh and
does not belong in the generic 'util' module.
2015-05-30 23:55:24 -07:00
Pierre-Yves David
e653dac6f9 util: allow to specify buffer size in popen4
We will need unbuffered IO to restore real time output with ssh peer.

Changeset b61b215fcfa8 seems to indicate playing with this value could be
dangerous, but does not indicate why.
2015-05-20 11:29:45 -05:00
Pierre-Yves David
c34d269931 util: drop the 'unpacker' helper
It is not helping anything anymore.
2015-05-18 23:43:36 -05:00
Pierre-Yves David
44c29cf85d MBTextWrapper: drop dedicated __init__ method
It was only there as a compatibility layer with a version of Python which we do
not support anymore.
2015-05-18 16:56:04 -05:00
Pierre-Yves David
5af170b397 util: drop the compatibility with Python 2.4 unpacker
Python 2.4 compatibility has packed and sailed.
2015-05-18 16:54:21 -05:00
Augie Fackler
5b25c17d54 util: drop any() and all() polyfills 2015-05-16 14:37:24 -04:00
Martin von Zweigbergk
4127f67694 util: drop alias for collections.deque
Now that util.deque is just an alias for collections.deque, let's just
remove it.
2015-05-16 11:28:04 -07:00
Adrian Buehlmann
db680f7ce8 util: kill Python 2.4 deque.remove hack 2015-05-16 09:03:21 +02:00
Matt Mackall
ef1d6d97be util: use try/except/finally 2015-05-15 09:58:21 -05:00
Siddharth Agarwal
c4e0365d23 util.checkcase: don't abort on broken symlinks
One case where that would happen is while trying to resolve a subrepo, if the
path to the subrepo was actually a broken symlink. This bug was exposed by an
hg-git test.
2015-05-03 12:49:15 -07:00
FUJIWARA Katsunori
d376c990a1 util: add removedirs as platform depending function
According to 6b1369445b7b introducing "windows._removedirs()":

    If a hg repository including working directory is a reparse point
    (directory symlinked or a junction point), then using
    os.removedirs will remove the reparse point erroneously.

"windows._removedirs()" should be used instead of "os.removedirs()" on
Windows.

This patch adds "removedirs" as platform depending function to replace
"os.removedirs()" invocations for portability and safety.
2015-04-11 00:47:09 +09:00
Drew Gottlieb
901ac5e726 util: move dirs() and finddirs() from scmutil to util
An upcoming commit requires that match.py be able to call scmutil.dirs(), but
when match.py imports scmutil, a dependency cycle is created. This commit
avoids the cycle by moving dirs() and its related finddirs() function from
scmutil to util, which match.py already depends on.
2015-04-06 14:36:08 -07:00
Siddharth Agarwal
16d94fc24b util: add normcase spec and fallback
These will be used in upcoming patches to efficiently create a dirstate
foldmap.
2015-04-01 00:38:56 -07:00
Augie Fackler
723d5ae42f util: add progress callback support to copyfiles 2015-03-19 10:24:22 -04:00
Yuya Nishihara
3dbf9536f7 sortdict: have update() accept either dict or iterable of key/value pairs
Future patches will make the templater store a sorted dict in the _hybrid object.
sortdict should be constructed from a sorted list.
2015-02-18 22:53:53 +09:00
André Klitzing
350700ad60 util: accept "now, today, yesterday" for dates even if the locale is not English
Hi there!

Fixed date names are helpful for automated systems, so it should be possible to
use English date parameters even if the underlying system uses another
locale.

We have a Jenkins here with build jobs on different slaves that do
some operations with "dates" parameters. Some systems use an English locale
and some systems use a German locale, so we needed to configure the jobs to
use different date names.

As it is really annoying to keep each system's locale in mind for some
operations, I looked into util.py. It would be helpful for automated systems
if the "default English" date names were also usable in other
locales.

I attached a simple patch for this.

Best regards
  André Klitzing
2015-02-24 14:12:13 +01:00
Matt Harbison
721f888f2f transaction: really disable hardlink backups (issue4546) 2015-03-02 10:31:22 -05:00
Matt Mackall
055206a473 transaction: disable hardlink backups (issue4546)
Causing troubles, simplest fix.
2015-03-02 00:12:29 -06:00
Wagner Bruna
41a26dac7c messages: quote "hg help" hints consistently 2015-01-17 22:01:14 -02:00
Pierre-Yves David
71304e633e copyfile: allow optional hardlinking
Some code paths use 'copyfiles' (full tree) for a single file to take advantage
of the best-effort-hard-linking parameter. We add similar parameter and logic
to 'copyfile' (single file) for this purpose.

The single-file version has the advantage of overwriting the destination file
if it exists.
2015-01-05 12:39:09 -08:00
Matt Mackall
4accabb1fb unpacker: fix missing arg for py2.4 2015-01-14 16:57:00 -08:00
Matt Mackall
f10752833b unpacker: check the right exception type for 2.4 2015-01-13 16:15:02 -08:00
Matt Mackall
2537cb8fbf util: introduce unpacker
This allows taking advantage of Python 2.5+'s struct.Struct, which
provides a slightly faster unpack due to reusing formats. Sadly,
.unpack_from is significantly slower.
2015-01-10 21:18:31 -06:00
Mads Kiilerich
b420dd92b1 spelling: fixes from proofreading of spell checker issues 2014-04-17 22:47:38 +02:00
Pierre-Yves David
a756e8c469 util: add a 'nogc' decorator to disable the garbage collection
Garbage collection behaves pathologically when creating a lot of containers. As
we do that more than once, it becomes sensible to have a decorator for it. See
inline documentation for details.
2014-12-04 05:43:40 -08:00
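A decorator of the kind a756e8c469 describes could look roughly like the sketch below. It is written for illustration only and is not copied from util.py; in particular, restoring the previous collector state (rather than unconditionally re-enabling it) is an assumption of this example.

```python
import functools
import gc


def nogc(func):
    """Run func with the cyclic garbage collector disabled (sketch)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        was_enabled = gc.isenabled()
        gc.disable()
        try:
            return func(*args, **kwargs)
        finally:
            # Only re-enable if the collector was on when we were called.
            if was_enabled:
                gc.enable()

    return wrapper
```

A function that builds many containers in a tight loop can then be wrapped with `@nogc` so that the collector does not repeatedly scan the growing set of objects while it runs.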
FUJIWARA Katsunori
f60bafa1b3 vfs: add "notindexed" argument to invoke "ensuredir" with it in write mode
This patch uses "False" as default value of "notindexed" argument,
even though "vfs.makedir()" uses "True" for it, because "os.mkdir()"
doesn't set "_FILE_ATTRIBUTE_NOT_CONTENT_INDEXED" attribute to newly
created directories.
2014-11-19 18:35:14 +09:00
Yuya Nishihara
b2ed607f5e util.system: remove unused handling of onerr=ui
In our code, onerr is None or util.Abort.  It smells bad to overload ui and
exception class.
2014-11-08 13:14:19 +09:00
Sean Farley
93b998c77a sortdict: add insert method
Future patches will allow extensions to choose which order a namespace should
output in the log, so we add a way for sortdict to insert to a specific
location.
2014-10-15 12:39:19 -07:00
Sean Farley
8ea5f6192f sortdict: add iteritems method
Future patches will start using sortdict for log operations where order is
important. Adding iteritems removes the headache of having to remember to use
items() if the object is a sortdict.
2014-11-09 13:15:28 -08:00
Mads Kiilerich
523c87c1fe spelling: fixes from proofreading of spell checker issues 2014-04-17 22:47:38 +02:00
Siddharth Agarwal
c9db5b4295 util.fspath: use a dict rather than a linear scan for lookups
Previously, we'd scan through the entire directory listing looking for a
normalized match.  This is O(N) in the number of files in the directory. If we
decide to call util.fspath on each file in it, the overall complexity works out
to O(N^2). This becomes a problem with directories of a few thousand files or
larger.

Switch to using a dictionary instead. There is a slightly higher upfront cost
to pay, but for cases like the above this is amortized O(1). Plus there is a
lower constant factor because generator comprehensions are faster than for
loops, so overall it works out to be a very small loss in performance for 1
file, and a huge gain when there's more.

For a large repo with around 200k files in it on a case-insensitive file
system, for a large directory with over 30,000 files in it, the following
command was tested:

ls | shuf -n $COUNT | xargs hg status

This command leads to util.fspath being called on $COUNT files in the
directory.

COUNT  before  after
    1   0.77s  0.78s
  100   1.42s  0.80s
 1000    6.3s  0.96s

I also tested with COUNT=10000, but before took too long so I gave up.
2014-10-24 11:39:39 -07:00
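The change in c9db5b4295 replaces a scan per lookup with a one-time index. A rough sketch of the two strategies follows; `resolve_linear` and `resolve_indexed` are hypothetical names, `str.lower` stands in for the real case normalization, and the actual util.fspath works on path components rather than flat name lists.

```python
def resolve_linear(names, wanted, normcase=str.lower):
    """Scan the whole listing for every wanted name: O(N) per lookup."""
    result = []
    for want in wanted:
        target = normcase(want)
        for name in names:
            if normcase(name) == target:
                result.append(name)
                break
        else:
            result.append(want)  # no match: keep the name as given
    return result


def resolve_indexed(names, wanted, normcase=str.lower):
    """Build a normalized-name index once, then do O(1) lookups."""
    index = dict((normcase(name), name) for name in names)
    return [index.get(normcase(want), want) for want in wanted]
```

The two functions agree when normalized names are unique; if two entries normalize to the same key, the linear version returns the first and the indexed version the last.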
Wagner Bruna
779ceca4ff i18n: add hint to digest mismatch message 2014-10-23 12:35:10 -02:00
Yuya Nishihara
e7ee70da05 util.system: avoid buffering of subprocess output when it is piped
util.system() copies subprocess' output through pipe if output file is not
stdout.  Because a file iterator has internal buffering, output won't be
flushed until enough data is available.  Therefore, it could easily miss
important messages such as "waiting for lock".
2014-08-30 17:38:14 +02:00
Mike Hommey
d2b17ca844 util: add a file handle wrapper class that does hash digest validation
It is going to be used for the remote-changegroup feature in bundle2.
2014-10-16 17:03:21 +09:00
Mike Hommey
6acd9847bf util: add a helper class to compute digests
It is going to be used for the remote-changegroup feature in bundle2.
2014-10-16 17:02:51 +09:00
Mike Hommey
9741dad0cc util: move md5 back next to sha1 and allow to call it without an argument
This effectively backs out changeset 7582042d6cce.

The API change is done so that both util.sha1 and util.md5 can be called the
same way. The function is moved in order to use it for md5 checksumming for
an upcoming bundle2 feature.
2014-09-24 16:00:47 +09:00
Pierre-Yves David
60ead8ca90 util: fix sorteddict.pop
When using `.pop` on such an object, the list was not cleared of the popped key,
leading to a crash.
2014-10-02 12:39:37 -05:00
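The bug fixed in 60ead8ca90 arises whenever a dict subclass keeps a separate list of keys for ordering: every removal path has to update both structures. The sketch below shows the idea with a hypothetical `OrderedKeysDict` class; it is not the sortdict implementation from util.py.

```python
class OrderedKeysDict(dict):
    """Dict that remembers insertion order in a side list (hypothetical)."""

    def __init__(self):
        super().__init__()
        self._order = []

    def __setitem__(self, key, value):
        if key in self:
            self._order.remove(key)
        self._order.append(key)
        super().__setitem__(key, value)

    def __iter__(self):
        return iter(self._order)

    def keys(self):
        return list(self._order)

    def pop(self, key, *default):
        # Raises KeyError here if the key is missing and no default is given.
        value = super().pop(key, *default)
        # Without this step the side list would still mention the popped
        # key, and later iteration would look up a key that no longer exists.
        if key in self._order:
            self._order.remove(key)
        return value
```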
Mads Kiilerich
e49fb83975 i18n: use datapath for i18n like for templates and help
To avoid circular module dependencies we initialize i18n from util when
datapath is found.
2014-09-28 16:57:47 +02:00
Mads Kiilerich
e9c0145df2 util: introduce datapath for getting the location of supporting data files
Templates, help and locale data are normally stored as sub folders in the
directory containing the source of the mercurial module. In a frozen build they
live as sub folders next to 'hg.exe' and 'library.zip'.

These different kinds of data were handled in different ways. Unify that by
introducing util.datapath. The value is computed from the environment and is
always used, so we just calculate the value on module load.
2014-09-28 16:57:06 +02:00
Mads Kiilerich
9b9c077bd6 util: move _hgexecutable a few lines, closer to where it is used 2014-09-28 16:57:06 +02:00
Gregory Szorc
db57d5e9d6 platform: implement readpipe()
Reading all available data from a pipe has a platform-dependent
implementation.

This patch establishes platform.readpipe() by copying the
inline implementation in sshpeer.readerr(). The implementations
for POSIX and Windows are currently identical. The POSIX
implementation will be changed in a subsequent patch.
2014-08-15 20:02:18 -07:00
Siddharth Agarwal
2b459094e1 util.re: add an escape method
The escape method in at least one of the modules called 're2' is in C. This
means it is significantly faster than the Python code written in 're'.

An upcoming patch will have benchmarks.
2014-07-15 15:14:45 -07:00
Siddharth Agarwal
dd9e1b721a util.re: move check for re2 into a separate method
We're going to use the same check for another method in an upcoming patch.
2014-07-15 15:01:52 -07:00
Siddharth Agarwal
fe4e13633a util: remove no longer used compilere 2014-07-15 14:52:40 -07:00
Siddharth Agarwal
d0cd5cb8c7 util: move compilere to a class
We do this to allow us to use descriptors for other related methods.

For now, util.compilere does the same thing. Upcoming patches will remove it.
2014-07-15 14:40:43 -07:00
Siddharth Agarwal
04ca7a05ee util: rename 're' to 'remod'
Upcoming patches will introduce a binding called 're'.
2014-07-15 14:35:19 -07:00
FUJIWARA Katsunori
5206a6fd25 util: replace 'ellipsis' implementation by 'encoding.trim'
Before this patch, 'util.ellipsis' tried to avoid splitting at
intermediate multi-byte sequence, but its implementation was incorrect.

The internal function '_ellipsis' trims the specified unicode sequence to
at most maxlength 'unicode characters', not to at most maxlength
'columns in display'.

    def _ellipsis(text, maxlength):
        if len(text) <= maxlength:
            return text, False
        else:
            return "%s..." % (text[:maxlength - 3]), True

In many encodings, number of unicode characters can be different from
columns in display.

This patch replaces 'ellipsis' implementation by 'encoding.trim',
which can trim string at most maxlength columns in display correctly,
even though specified string contains multi-byte characters.

'_ellipsis' is removed in this patch, because it is referred only from
'ellipsis'.
2014-07-06 02:56:41 +09:00
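The difference between counting characters and counting display columns can be sketched as follows. This is not the code of 'encoding.trim'; the helper names and the rule "wide and fullwidth characters take two columns" are assumptions made for illustration, built on the standard `unicodedata.east_asian_width`.

```python
import unicodedata


def colwidth(text):
    """Approximate display width: wide/fullwidth characters count as 2."""
    return sum(2 if unicodedata.east_asian_width(c) in "WF" else 1
               for c in text)


def trim(text, width, ellipsis="..."):
    """Trim text so that it occupies at most `width` display columns."""
    if colwidth(text) <= width:
        return text
    budget = width - colwidth(ellipsis)
    if budget <= 0:
        return ellipsis[:width]
    out = []
    used = 0
    for c in text:
        w = colwidth(c)
        if used + w > budget:
            break
        out.append(c)
        used += w
    return "".join(out) + ellipsis
```

A seven-character Japanese string occupies fourteen columns, so a check based on `len()` would leave it untrimmed at maxlength 7, while the column-based version cuts it down.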
Angel Ezquerra
74f8eca049 config: move config.sortdict class into util
This makes it more natural to use the sortdict class from outside config.py.
2014-02-23 01:56:31 +01:00
FUJIWARA Katsunori
18631f0006 util: enable "hooks" to return list of the values returned from each hook 2014-04-16 00:37:24 +09:00
Pierre-Yves David
a04ebd710a util: support None size in chunkbuffer.read()
When no size is provided, read the whole buffer. This aligns with the usual
behavior of `read()` in python.
2014-04-10 22:10:26 -07:00
FUJIWARA Katsunori
accc1366a9 util: add the code path to "cachefunc()" for the function taking no arguments
Before this patch, "util.cachefunc()" caches the value returned by the
specified function into dictionary "cache", even if the specified
function takes no arguments.

In such case, "cache" has at most one entry, and distinction between
entries in "cache" is meaningless.

This patch adds the code path to "cachefunc()" for the function taking
no arguments for efficiency: to store only one cached value, using
list "cache" is a little faster than using dictionary "cache".
2014-02-15 19:52:26 +09:00
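A minimal sketch of the two code paths, assuming the argument count is discovered with `inspect.signature` (the original may detect it differently): a one-slot list serves functions without arguments, a dictionary keyed by the argument tuple serves the rest.

```python
import inspect


def cachefunc(func):
    """Memoize func; zero-argument functions get a single-slot list cache."""
    if len(inspect.signature(func).parameters) == 0:
        cache = []

        def f():
            # At most one value is ever stored, so a list is enough.
            if not cache:
                cache.append(func())
            return cache[0]

        return f

    cache = {}

    def f(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return f
```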
Augie Fackler
483cc1f586 util: move from dict() construction to {} literals
The latter are both faster and more consistent across Python 2 and 3.
2014-03-12 13:19:20 -04:00
Mads Kiilerich
877ecd2425 util: debugstacktrace, flush before and after writing
Close another stream (default stdout, which often is buffered) before writing
to the primary stream (default stderr, which often is unbuffered). The primary
stream is also flushed after writing (in case it is buffered).

This fixes non-deterministic output order, especially on windows.
2014-02-20 02:38:36 +01:00
Siddharth Agarwal
22a22b291a util.url: add an 'islocal' method
This returns True if the URL represents a path that can be opened locally,
without needing to go through the entire URL open mechanism.
2014-02-03 14:47:41 -08:00
Mads Kiilerich
771c21f193 util: introduce util.debugstacktrace for showing a stack trace without crashing
This is often very handy when hacking/debugging.

Calling util.debugstacktrace('hey') from a place in hg will give something like:
  hey at:
   ./hg:38                                     in <module>
   /home/user/hgsrc/mercurial/dispatch.py:28   in run
   /home/user/hgsrc/mercurial/dispatch.py:65   in dispatch
   /home/user/hgsrc/mercurial/dispatch.py:88   in _runcatch
   /home/user/hgsrc/mercurial/dispatch.py:740  in _dispatch
   /home/user/hgsrc/mercurial/dispatch.py:514  in runcommand
   /home/user/hgsrc/mercurial/dispatch.py:830  in _runcommand
   /home/user/hgsrc/mercurial/dispatch.py:801  in checkargs
   /home/user/hgsrc/mercurial/dispatch.py:737  in <lambda>
   /home/user/hgsrc/mercurial/util.py:472      in check
...
2014-01-12 23:28:21 +01:00
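A rough sketch of such a helper, written against Python 3's `traceback` module. The parameter names (`skip`, `f`, `otherf`) and the column layout are assumptions modelled on the sample output above; flushing the other stream first follows the later "flush before and after writing" change in this log.

```python
import sys
import traceback


def debugstacktrace(msg="stacktrace", skip=0, f=None, otherf=None):
    """Write msg and the current call stack to f without raising."""
    f = sys.stderr if f is None else f
    otherf = sys.stdout if otherf is None else otherf
    otherf.flush()  # keep output ordering deterministic across streams
    f.write("%s at:\n" % msg)
    # Drop this function's own frame plus `skip` frames of callers.
    entries = traceback.extract_stack()[: -1 - skip]
    rows = [("%s:%d" % (e.filename, e.lineno), e.name) for e in entries]
    width = max((len(loc) for loc, _name in rows), default=0)
    for loc, name in rows:
        f.write(" %-*s in %s\n" % (width, loc, name))
    f.flush()
```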
Christian Ebert
a9aa17b61c util: remove unused realpath (issue4063)
util.realpath was in use for only 5 days from 17bc9a6bb165
until it was backed out in e60acde24a62 because it caused
issue3077 and issue3071.
2013-12-29 13:54:04 +00:00
Matt Mackall
05fd1f2542 merge with stable 2013-11-25 16:15:44 -06:00
Simon Heimberg
2143905ef7 util: url keeps backslash in paths
Backslashes (\) in paths were encoded to %5C when converting from url to
string. This does not look nice for windows paths. And it introduces many
problems when running tests on windows.
2013-11-20 22:03:15 +01:00
Mads Kiilerich
32fefa2839 util: warn when adding paths ending with \
Paths ending with \ will fail the verification introduced in 0bc0c17d663e when
checking out on Windows ... and if it didn't fail it would probably not do what
the user expected.
2013-11-08 12:35:50 +01:00
Augie Fackler
9f876f6c89 cleanup: move stdlib imports to their own import statement
There are a few warnings still produced by my import checker, but
those are false positives produced by modules that share a name with
stdlib modules.
2013-11-06 16:48:06 -05:00
Matt Mackall
0ed8797e8c merge with stable 2013-11-16 12:44:28 -05:00
Matt Mackall
6e88e21cca date: allow %z in format (issue4040) 2013-11-07 15:24:23 -06:00
Mads Kiilerich
eabc047878 spelling: random spell checker fixes 2013-10-24 01:49:56 +08:00
Matt Mackall
7b8a7d221c merge with stable 2013-10-01 17:00:03 -07:00
Pierre-Yves David
a411302ece repoview: make propertycache.setcache compatible with repoview
Propertycache used standard attribute assignment. In the repoview case, this
assignment was forwarded to the unfiltered repo. This resulted in:
(1) unfiltered repo got a potentially wrong cache value,
(2) repoview never reused the cached value.

This patch replaces the standard attribute assignment by an assignment to
`objc.__dict__` which will bypass the `repoview.__setattr__`. This will not
affect other `propertycache` users and it is actually closer to the semantics we
need.

The interaction of `propertycache` and `repoview` are now tested in a python
test file.
2013-09-30 14:36:11 +02:00
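The mechanism can be sketched with a non-data descriptor that stores its result directly in the instance `__dict__`. The method name `setcache` follows the commit title; the rest is an illustrative reconstruction, not the actual util.py code.

```python
class propertycache:
    """Compute an attribute once, then serve it from the instance dict."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        result = self.func(obj)
        self.setcache(obj, result)
        return result

    def setcache(self, obj, value):
        # Writing to __dict__ bypasses any custom __setattr__ on the
        # object, such as one that forwards assignments elsewhere.
        obj.__dict__[self.name] = value
```

Because the descriptor defines no `__set__`, the cached entry in the instance dict takes precedence on later lookups, so the function runs only once per object.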
Jeff Sickel
c1824b83d9 plan9: update util.py for cpython 2.7 build 2013-09-13 15:40:04 -05:00
Siddharth Agarwal
26051a2eee lrucachedict: implement clear() 2013-09-06 13:16:21 -07:00
Simon Heimberg
bbed227d30 util: check if re2 works before using it (issue 3964) 2013-07-01 06:50:58 +02:00
Bryan O'Sullivan
101b24f17b util: add an optional timestamp parameter to makedate
This will be used by the upcoming shelve extension.
2013-06-03 17:20:45 -07:00
Bryan O'Sullivan
4143ebfdee util: rename ct variable in makedate to timestamp 2013-06-03 17:20:44 -07:00
Bryan O'Sullivan
5acd0ede31 summary: augment output with info from extensions 2013-05-14 11:23:15 -07:00
Bryan O'Sullivan
502e5bc1d1 util: migrate fileset._sizetoint to util.sizetoint
The size counting code introduced in 233431858f4c duplicated existing
(but unknown-to-me) code in fileset, so prepare to eliminate the
duplication.
2013-05-14 15:16:43 -07:00
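A sketch of what a size-string parser of this kind might look like. The accepted suffixes and the truncation toward zero are assumptions for illustration, not a statement of the real function's behaviour.

```python
def sizetoint(s):
    """Convert a size such as '30', '2.2kb' or '6M' to a number of bytes."""
    t = s.strip().lower()
    # Longer suffixes come first so 'kb' is matched before 'b'.
    units = (
        ("kb", 1 << 10), ("mb", 1 << 20), ("gb", 1 << 30),
        ("k", 1 << 10), ("m", 1 << 20), ("g", 1 << 30),
        ("b", 1),
    )
    for suffix, multiplier in units:
        if t.endswith(suffix):
            return int(float(t[: -len(suffix)]) * multiplier)
    return int(t)
```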
Angel Ezquerra
19a754cb08 util: add notindexed optional parameter to makedirs function 2013-02-16 11:44:13 +01:00
Bryan O'Sullivan
8038ef7b7e util: remove unreachable code
Found by Cython.
2013-04-12 19:48:07 -07:00
Bryan O'Sullivan
e2ab8435d3 util: remove no-op assignment
Found by Cython.
2013-04-12 19:33:48 -07:00
Mads Kiilerich
c7ab477d55 util: improve doc for checkcase 2013-02-11 00:43:12 +01:00
Bryan O'Sullivan
f023d03282 util: add functions to check symlink/exec bits
These are not yet used.
2013-04-03 11:35:27 -07:00
Bryan O'Sullivan
ced9b7970b util: add flag support to compilere 2013-03-11 12:06:13 -07:00
Durham Goode
97bdc357fb sshpeer: store subprocess so it cleans up correctly
When running 'hg pull --rebase', I was seeing this exception 100% of the
time as the python process was closing down:

Exception TypeError: TypeError("'NoneType' object is not callable",) in
<bound method Popen.__del__ of <subprocess.Popen object at 0x937c10>> ignored

By storing the subprocess on the sshpeer, the subprocess seems to clean up
correctly, and I no longer see the exception. I have no idea why this actually
works, but I get a 0% repro if I store the subprocess in self.subprocess,
and a 100% repro if I store None in self.subprocess.

Possibly related to issue 2240.
2013-03-08 16:59:36 -08:00
Bryan O'Sullivan
95f0609257 util: add a timed function for use during development
I often want to measure the cost of a function call before/after
an optimization, where using top level "hg --time" timing introduces
enough other noise that I can't tell if my efforts are having an
effect.

This decorator allows a developer to measure a function's cost with
finer granularity.
2013-02-28 13:11:42 -08:00
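A minimal sketch of a timing decorator in this spirit, assuming Python 3's `time.perf_counter` and a plain one-line report on stderr (the original's output format is not shown in the message):

```python
import functools
import sys
import time


def timed(func):
    """Report the wall-clock time of each call to func on stderr."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            sys.stderr.write("%s: %.6f s\n" % (func.__name__, elapsed))

    return wrapper
```

Decorating only the function under study avoids the startup and I/O noise that a whole-command timer such as "hg --time" includes.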
Bryan O'Sullivan
ae3da7aa9a util: generalize bytecount to unitcountfn
This gives us a function we can reuse to count units of other kinds.
2013-02-28 12:51:18 -08:00
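The generalization can be sketched as a factory that takes a table of (multiplier, divisor, format) entries and returns a formatting function. The table layout and the unit strings below are assumptions chosen for illustration.

```python
def unitcountfn(*unittable):
    """Build a formatter from (multiplier, divisor, format) entries."""

    def go(count):
        # Use the first unit whose threshold the count reaches.
        for multiplier, divisor, fmt in unittable:
            if count >= divisor * multiplier:
                return fmt % (count / float(divisor))
        # Nothing matched (e.g. count == 0): fall back to the last entry.
        return unittable[-1][2] % count

    return go


bytecount = unitcountfn(
    (100, 1 << 30, "%.0f GB"),
    (10, 1 << 30, "%.1f GB"),
    (1, 1 << 30, "%.2f GB"),
    (100, 1 << 20, "%.0f MB"),
    (10, 1 << 20, "%.1f MB"),
    (1, 1 << 20, "%.2f MB"),
    (100, 1 << 10, "%.0f KB"),
    (10, 1 << 10, "%.1f KB"),
    (1, 1 << 10, "%.2f KB"),
    (1, 1, "%.0f bytes"),
)
```

The same factory could then be given a different table to count other units, which is the reuse the message refers to.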
Bryan O'Sullivan
9b9339ed49 util: make ensuredirs safer against races 2013-02-13 12:20:10 -08:00
Bryan O'Sullivan
ede7482a3a scmutil: create directories in a race-safe way during update
With the new parallel update code, it is possible for multiple
workers to try to create a hierarchy of directories at the same
time. This is hard to trigger in general, but most likely during
initial checkout.

To deal with these races, we introduce a new ensuredirs function
whose contract is to ensure that a directory hierarchy exists - it
will ignore a failure that implies that the desired directory already
exists.
2013-02-11 16:15:12 -08:00
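The contract described above can be sketched as follows. The signature and the recursion over parent directories are assumptions; the key point is that an EEXIST failure is ignored when the directory turns out to exist, since another worker may have created it first.

```python
import errno
import os


def ensuredirs(name, mode=None):
    """Make sure the directory hierarchy `name` exists, tolerating races."""
    name = os.path.abspath(name)
    if os.path.isdir(name):
        return
    parent = os.path.dirname(name)
    if parent != name:
        ensuredirs(parent, mode)
    try:
        os.mkdir(name)
    except OSError as err:
        # Another process may have created the directory between the
        # isdir() check and mkdir(); that outcome is what we wanted.
        if err.errno == errno.EEXIST and os.path.isdir(name):
            return
        raise
    if mode is not None:
        os.chmod(name, mode)
```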
Augie Fackler
e8c901fc2d parsedate: understand "now" as a shortcut for the current time 2013-02-09 15:39:22 -06:00
Siddharth Agarwal
b13982495e util: add an LRU cache dict
In certain cases we would like to have a cache of the last N results of a
given computation, where N is small. This will be used in an upcoming patch to
increase the size of the manifest cache from 1 to 3.
2013-02-09 15:41:46 +00:00
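A small LRU dictionary can be sketched with `collections.OrderedDict`. This is a swapped-in technique for illustration: the class added in this changeset predates convenient use of OrderedDict and may be implemented quite differently.

```python
from collections import OrderedDict


class lrucachedict:
    """Keep at most maxsize entries, evicting the least recently used."""

    def __init__(self, maxsize):
        self._maxsize = maxsize
        self._data = OrderedDict()

    def __getitem__(self, key):
        value = self._data[key]
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def __setitem__(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        elif len(self._data) >= self._maxsize:
            self._data.popitem(last=False)  # evict the oldest entry
        self._data[key] = value

    def __contains__(self, key):
        return key in self._data

    def clear(self):
        self._data.clear()
```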
Paul Cavallaro
a9ed690f88 dates: support 'today' and 'yesterday' in parsedate (issue3764)
Adding support to parsedate in util module to understand the more idiomatic
dates 'today' and 'yesterday'.

Added unified tests and docstring tests for added functionality.
2013-01-23 09:51:45 -08:00
Mads Kiilerich
33393d7e47 util: copyfile: remove dest before copying
This prevents spurious problems writing to locked files on Windows.
2013-01-10 00:44:23 +01:00
Bryan O'Sullivan
58c82f12c9 osutil: write a C implementation of statfiles for unix
This makes a big difference to performance.

In a clean working directory containing 170,000 files, performance of
"hg --time diff" improves from 2.38 seconds to 1.69.
2012-12-03 12:40:24 -08:00
Pierre-Yves David
b25a880a8e clfilter: add a propertycache that must be unfiltered
Some of the localrepo property caches must be computed unfiltered and
stored globally. Some others must see the filtered version and store data
relative to the current filtering.

This changeset introduces two classes `unfilteredpropertycache`
and `filteredpropertycache` for this purpose. A new function
`hasunfilteredcache` is introduced for unambiguous checking for cached
values on unfiltered repos.

A few tweaks are made to the property cache class to allow overriding
the way the computed value is stored on the object.

Some logic relative to _tagcaches is cleaned up in the process.
2012-10-08 20:02:20 +02:00
Matt Mackall
af5b4b62cf util: make chunkbuffer non-quadratic on Windows
The old str-based += collector performed very nicely on Linux, but
turns out to be quadratically expensive on Windows, causing
chunkbuffer to dominate in profiles.

This list-based version has been measured to significantly improve
performance with large chunks on Windows, with negligible overall
overhead on Linux (though microbenchmarks show it to be about 50% slower).

This may increase memory overhead where += didn't behave quadratically. If we
want to gather up 1G of data to join, we temporarily have 1G in our
list and 1G in our string.
2012-11-26 15:42:52 -06:00
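The list-based collection strategy can be sketched like this. The function name and return shape are hypothetical; the point is that pieces are appended to a list and joined once, rather than grown with repeated `+=` on a string.

```python
def collect(chunks, size):
    """Gather up to `size` bytes from an iterator of byte chunks.

    Returns (data, leftover), where leftover is the unread tail of the
    last chunk that was split.
    """
    parts = []
    left = size
    leftover = b""
    for chunk in chunks:
        if len(chunk) > left:
            parts.append(chunk[:left])
            leftover = chunk[left:]
            left = 0
        else:
            parts.append(chunk)
            left -= len(chunk)
        if left == 0:
            break
    # A single join copies each byte once; the list and the joined
    # result briefly coexist, which is the memory cost noted above.
    return b"".join(parts), leftover
```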
Bryan O'Sullivan
e3555667b8 util: implement a faster os.path.split for posix systems
This is not yet used.
2012-09-14 12:08:17 -07:00
Mads Kiilerich
2372d51b68 fix wording and not-completely-trivial spelling errors and bad docstrings 2012-08-15 22:39:18 +02:00
Mads Kiilerich
2f4504e446 fix trivial spelling errors 2012-08-15 22:38:42 +02:00
Ross Lagerwall
661779d660 util: replace util.nulldev with os.devnull
Python since 2.4 has supported os.devnull so having util.nulldev
is unnecessary.
2012-08-04 07:14:40 +02:00
Bryan O'Sullivan
9f3858de6e util: delegate seek and tell methods of atomictempfile 2012-07-23 15:38:43 -07:00
Adrian Buehlmann
0fe77b0110 util, posix: eliminate encodinglower and encodingupper
bffd8f8dfc85 claims this was needed "to avoid cyclic dependency", but there is
no cyclic dependency.

windows.py already imports encoding, posix.py can import it too, so we can
simply use encoding.upper in windows.py and in posix.py.

(this is a partial backout of bffd8f8dfc85)
2012-07-18 14:41:58 +02:00
Bryan O'Sullivan
3f45806d34 matcher: use re2 bindings if available
There are two sets of Python re2 bindings available on the internet;
this code works with both.

Using re2 can greatly improve "hg status" performance when a .hgignore
file becomes even modestly complex.

Example: "hg status" on a clean tree with 134K files, where "hg
debugignore" reports a regexp 4256 bytes in size.

  no .hgignore: 1.76 sec
  Python re:    2.79
  re2:          1.82

The overhead of regexp matching drops from 1.03 seconds with stock
re to 0.06 with re2.

(For comparison, a git repo with the same contents and .gitignore
file runs "git status -s" in 1.71 seconds, i.e. only slightly faster
than hg with re2.)
2012-06-01 15:26:20 -07:00
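The "use re2 if available" pattern can be sketched as an optional import with a fallback. This assumes a third-party module importable as `re2` that exposes a `compile` function; which binding is installed, and which patterns it rejects, are not specified here.

```python
import re

try:
    import re2  # optional third-party binding; may be absent
except ImportError:
    re2 = None


def compilere(pattern):
    """Compile with re2 when possible, otherwise with the stdlib re."""
    if re2 is not None:
        try:
            return re2.compile(pattern)
        except Exception:
            # re2 does not support every construct of Python's re
            # syntax, so fall back instead of failing.
            pass
    return re.compile(pattern)
```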
Bryan O'Sullivan
1de6d211c8 util: simplify queue management in chunkbuffer
This also fixes a small wire protocol performance regression.
2012-06-05 16:52:20 -07:00
Bryan O'Sullivan
abdf4a8227 util: subclass deque for Python 2.4 backwards compatibility
It turns out that Python 2.4's deque type is lacking a remove method.
We can't implement remove in terms of find, because it doesn't have
find either.
2012-06-01 17:05:31 -07:00
Bryan O'Sullivan
bef5b61512 cleanup: use the deque type where appropriate
There have been quite a few places where we pop elements off the
front of a list.  This can turn O(n) algorithms into something more
like O(n**2).  Python has provided a deque type that can do this
efficiently since at least 2.4.

As an example of the difference a deque can make, it improves
perfancestors performance on a Linux repo from 0.50 seconds to 0.36.
2012-05-15 10:46:23 -07:00
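The cost difference comes from `list.pop(0)` shifting every remaining element, while `deque.popleft()` runs in constant time. A minimal sketch of a breadth-first ancestor walk using a deque (the graph representation is hypothetical):

```python
from collections import deque


def ancestors(parents, start):
    """Breadth-first walk; `parents` maps a node to its parent nodes."""
    seen = set(start)
    visit = deque(start)
    order = []
    while visit:
        node = visit.popleft()  # O(1), unlike list.pop(0) which is O(n)
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                visit.append(p)
                order.append(p)
    return order
```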
Matt Mackall
f4a789ba4d merge with stable 2012-05-21 17:35:28 -05:00
Augie Fackler
3dc5160169 util: fix bad variable use in bytecount introduced by ad5e3bec298e 2012-05-21 14:24:24 -05:00
Brodie Rao
7f47d4e347 check-code: ignore naked excepts with a "re-raise" comment
This also promotes the naked except check from a warning to an error.
2012-05-13 13:18:06 +02:00
Brodie Rao
46ce54af4d cleanup: replace more naked excepts with more specific ones 2012-05-13 13:17:31 +02:00
Brodie Rao
c577fac135 cleanup: replace naked excepts with more specific ones 2012-05-12 16:02:45 +02:00
Matt Mackall
d38924097e util: create bytecount array just once
This avoids tons of gettext calls on workloads that call bytecount a lot.
2012-04-12 20:22:18 -05:00
Steven Stallion
d79ff306e5 plan9: initial support for plan 9 from bell labs
This patch contains support for Plan 9 from Bell Labs. A README is
provided in contrib/plan9 which describes the port in greater detail.
A new extension is also provided named factotum which permits the
factotum(4) authentication agent to provide credentials for HTTP
repositories. This extension is also applicable to other POSIX
platforms which make use of Plan 9 from User Space (aka plan9ports).
2012-04-08 12:43:41 -07:00
Matteo Capobianco
8305e8d305 templates/filters: extracting the user portion of an email address
Currently, the 'user' filter is using util.shortuser(text) (which clearly
doesn't extract only the user portion of an email address, even though the
help text says it does).

The new 'emailuser' filter uses the new util.emailuser(text) function which,
instead, does exactly that.

The help text on the 'user' filter has been modified accordingly.
2012-03-28 16:06:20 +02:00
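Extracting only the user portion of an address can be sketched as below. The exact rules of util.emailuser are not quoted in the message, so this is an assumption: drop everything from '@' onward, then drop everything up to and including '<'.

```python
def emailuser(user):
    """Return the part of an email address before the '@'."""
    at = user.find("@")
    if at >= 0:
        user = user[:at]
    lt = user.find("<")
    if lt >= 0:
        user = user[lt + 1:]
    return user
```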
FUJIWARA Katsunori
3abfeb7e54 icasefs: rewrite comment to explain situation precisely 2011-12-24 00:52:06 +09:00
FUJIWARA Katsunori
b180efb872 icasefs: follow standard cache look up pattern 2011-12-24 00:51:14 +09:00
FUJIWARA Katsunori
1edd7d1c6d icasefs: disuse length check against un-normcase()-ed filenames
this patch disuses length check against un-normcase()-ed filenames
gotten by "os.listdir()", because there is no assurance that
filesystem stores filenames normalized except in letter case, even
though some case insensitive filesystems (in some environment, for
some language setting) store them in such manner.
2011-12-24 00:50:56 +09:00
FUJIWARA Katsunori
2d248cd109 icasefs: avoid path-absoluteness/existence check in util.fspath() for efficiency
'dirstate._normalize()', the only caller of 'util.fspath()', has
already confirmed existence of specified file as relative to root.

so, this patch omits path-absoluteness/existence check from
'util.fspath()'.
2011-12-16 21:09:40 +09:00
FUJIWARA Katsunori
b5973249bd icasefs: retry directory scan once for already invalidated cache
some hg operations (e.g.: qpush) create new files after first
dirstate.walk()-ing, and it invalidates _fspathcache for fspath().

then, fspath() will fail to look up specified name in _fspathcache.

this causes case preservation breaking, because parts of already
normcase()-ed path are used as result at that time.

in this case, file creation and writing out should be done before
fspath() invocation, so the second invocation of os.listdir() has not
so much impact on runtime performance.
2011-12-16 21:09:40 +09:00
Matt Mackall
7cf4e6eacb merge with stable 2011-12-16 19:05:59 -06:00
FUJIWARA Katsunori
fe972435d4 i18n: use encoding.lower/upper for encoding aware case folding
this patch uses encoding.lower/upper for case folding, because the ones of
str cannot fold the case of non-ASCII characters correctly.

to avoid cyclic dependency and to encapsulate logic of normcase in
each platforms, this patch introduces encodinglower/encodingupper in
both posix/windows specific files.

this patch does not change implementation of normcase() in posix.py,
because we do not know the encoding of filenames on POSIX.

some "normcase()" are excluded from function wrap list in
hgext/win32mbcs.py, because they become encoding aware by this patch.
2011-12-16 21:09:41 +09:00
FUJIWARA Katsunori
055136813d icasefs: avoid normcase()-ing in util.fspath() for efficiency
'dirstate._normalize()', the only caller of 'util.fspath()', has
already normcase()-ed path before invocation of it.

normcase()-ed root can be cached on dirstate side, too.

so, this patch changes 'util.fspath()' API specification to avoid
normcase()-ing in it.
2011-12-16 21:09:40 +09:00
FUJIWARA Katsunori
c71595845e icasefs: use util.normcase() instead of lower() or os.path.normcase in fspath
this also avoids lower()-ing on each path components by reuse the path
normcase()-ed at beginning of function.
2011-12-16 21:09:40 +09:00
FUJIWARA Katsunori
f9ca02bd18 icasefs: consider as case sensitive if there is no counterevidence, for safety
for safety, this patch prevents case-less name from misleading into
case insensitivity, even though such names should not be used.
2011-12-16 21:09:40 +09:00
Matt Mackall
9f8ee10163 util: don't mess with builtins to emulate buffer() 2011-12-15 15:27:11 -06:00
Matt Mackall
49b0ffe198 util: clean up function ordering 2011-12-15 14:59:22 -06:00
Patrick Mezard
3a0effcd7b util: fix url.__str__() for windows file URLs
Before:

  >>> str(url('file:///c:/tmp/foo/bar'))
  'file:c%3C/tmp/foo/bar'

After:

  >>> str(url('file:///c:/tmp/foo/bar'))
  'file:///c%3C/tmp/foo/bar'

The previous behaviour had no effect on mercurial itself (clone command for
instance) because we fortunately called .localpath() on the parsed URL.
hgsubversion was not so lucky and cloning a local subversion repository on
Windows no longer worked on the default branch (it works on stable because
2b62605189dc defeats the hasdriveletter() test in url class).

I do not know if the %3C is correct or not but svn accepts file:// URLs
containing it. Mads fixed it in 2b62605189dc, so we can always backport should
the need arise.
2011-12-04 18:22:25 +01:00
Dmitry Panov
6649925e57 makedate: wrong timezone offset if DST rules changed this year (issue2511)
Python's time module sets timezone and altzone based on UTC offsets of
two dates: first and middle day of the current year. This approach
doesn't work on a year when DST rules change.

For example Russia abandoned winter time this year, so the correct UTC
offset should be +4 now, but time.timezone returns 3 hours difference
because that's what it was on 01.01.2011.

Related python issue: http://bugs.python.org/issue1647654
2011-11-13 00:29:26 +00:00
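One way to avoid relying on `time.timezone` is to compute the offset for the specific timestamp by comparing its UTC and local representations. The sketch below is written for Python 3 and is an assumption about the approach, not the patched code; the optional `timestamp` parameter mirrors the later makedate change in this log.

```python
import datetime
import time


def makedate(timestamp=None):
    """Return (unixtime, offset), offset being seconds west of UTC."""
    if timestamp is None:
        timestamp = time.time()
    utc = datetime.datetime.fromtimestamp(
        timestamp, datetime.timezone.utc).replace(tzinfo=None)
    local = datetime.datetime.fromtimestamp(timestamp)
    # The offset is derived for this exact moment, so a change of DST
    # rules during the year does not skew it.
    delta = utc - local
    return timestamp, delta.days * 86400 + delta.seconds
```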
Mads Kiilerich
5d7000644a url: handle file://localhost/c:/foo "correctly"
The path was parsed correctly, but localpath prepended an extra '/' (as in
'/c:/foo') because it assumed it was an absolute unix path.
2011-11-16 00:10:56 +01:00
Matt Mackall
3eab62750e dirstate: fix case-folding identity for traditional Unix
We used to use os.path.normcase which was a no-op, which was unhelpful
for cases like VFAT on Linux.
2011-11-15 14:25:11 -06:00
Matt Mackall
9580de9b45 util: add a doctest for empty sha() calls 2011-10-31 15:41:39 -05:00
Matt Mackall
e82c2e671f merge with stable 2011-12-05 17:48:40 -06:00
Matt Mackall
75db0d196a merge with stable 2011-11-17 16:53:17 -06:00
Matt Mackall
bbf72a4e6e util: allow sha1() with no args
Normally this works because we replace util.sha1 with hashlib.sha1
after first use, but if the first user doesn't provide an arg, it
breaks.
2011-10-31 14:22:11 -05:00
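The lazy-replacement pattern described above can be sketched as a wrapper that rebinds its own module-level name on first use. Giving the wrapper a default argument is what lets the very first call work without arguments. The sketch uses Python 3 bytes; the original targets Python 2 strings.

```python
import hashlib


def sha1(s=b""):
    """First call: rebind the global name to hashlib.sha1, then delegate."""
    global sha1
    sha1 = hashlib.sha1
    return sha1(s)
```

Without the default value, a first call of the form `sha1()` would fail with a TypeError before the rebinding ever happened.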
Matt Mackall
226e1ff7c0 util: don't complain about '..' in path components not working on Windows 2011-10-24 16:57:14 -05:00
Matt Mackall
3a9838cebc merge with stable 2011-11-15 14:33:06 -06:00
Mads Kiilerich
6485196281 util: don't encode ':' in url paths
':' has no special meaning in paths, so there is no need for encoding it.

Not encoding ':' makes it easier to test on windows.
2011-11-07 03:25:10 +01:00
Matt Mackall
e538620d00 merge with stable 2011-09-27 18:50:18 -05:00
Kevin Gessner
d0a563a1b5 util: fix crash converting an invalid future date to string
Post-2038 timestamps cannot be handled on 32-bit architectures. Clamp
such dates to the maximum 32-bit timestamp.
2011-09-23 09:02:27 -07:00
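Clamping can be sketched as taking the minimum of the requested time and the largest signed 32-bit timestamp before formatting. The function name, parameters and format string are illustrative assumptions.

```python
import time

MAX32 = 0x7FFFFFFF  # largest timestamp a signed 32-bit time_t can hold


def datestr(timestamp, tzoffset=0, fmt="%Y-%m-%d %H:%M:%S"):
    """Format a timestamp, clamping far-future values to MAX32."""
    t = min(timestamp - tzoffset, MAX32)
    return time.strftime(fmt, time.gmtime(t))
```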
Mads Kiilerich
35dbb9abb2 http: handle push of bundles > 2 GB again (issue3017)
It was very elegant that httpsendfile implemented __len__ like a string. It was
however also dangerous because that protocol can't handle sizes bigger than 2 GB.
Mercurial tried to work around that, but it turned out to be too easy to
introduce new errors in this area.

With this change __len__ is no longer implemented at all and the code will work
the same way for short and long posts.
2011-09-21 22:52:00 +02:00
Matt Mackall
19be20e2ef url: parse fragments first (issue2997) 2011-09-10 17:49:19 -05:00
FUJIWARA Katsunori
5b5a083f16 i18n: calculate terminal columns by width information of each characters
neither number of 'bytes' in any encoding nor 'characters' is
appropriate to calculate terminal columns for specified string.

this patch modifies MBTextWrapper for:

  - overriding '_wrap_chunks()' to make it use not built-in 'len()'
    but 'encoding.colwidth()' for columns of string

  - fixing '_cutdown()' to make it use 'encoding.colwidth()' instead
    of local, similar but incorrect implementation

this patch also modifies 'encoding.py':

  - dividing 'colwidth()' into 2 pieces: one for calculation columns of
    specified UNICODE string, and another for rest part of original
    one. the former is used from MBTextWrapper in 'util.py'.

  - preventing 'colwidth()' from evaluating HGENCODINGAMBIGUOUS
    configuration per each invocation: 'unicodedata.east_asian_width'
    checking is kept intact for reducing startup cost.
2011-08-27 04:56:12 +09:00
Mads Kiilerich
ec483cfecb util: wrap lines with multi-byte characters correctly (issue2943)
This re-introduces the unicode conversion that was lost in e5976ee55f4b 5 years
ago and had the comment:
  To avoid corrupting multi-byte characters in line, we must wrap
  a Unicode string instead of a bytestring.
2011-08-06 23:52:20 +02:00
Patrick Mezard
9aadd2540f http: strip credentials from urllib2 manager URIs (issue2885)
urllib2 password manager does not strip credentials from URIs registered with
add_password() and compare them with stripped URIs in find_password(). Remove
credentials from URIs returned by util.url.authinfo(). It sometimes works when
no port was specified as the URI host is registered too.
2011-08-05 21:05:40 +02:00
Mads Kiilerich
965df356e5 url: really handle urls of the form file:///c:/foo/bar/ correctly
8264e5172141 made sure that paths that seemed to start with a windows drive
letter would not get an extra leading slash.

localpath should thus not try to handle this case by removing a leading slash,
and this special handling is thus removed.

(The localpath handling of this case was wrong anyway, because paths that look
like they start with a windows drive letter can't have a leading slash.)

A quick verification of this is to run 'hg id file:///c:/foo/bar/'.
2011-08-04 02:51:29 +02:00
Benoit Boissinot
573390f2a4 url: store and assume the query part of an url is in escaped form (issue2921) 2011-07-31 21:00:44 +02:00
Simon Heimberg
d6ebf02048 util: fix finding of hgexecutable
The version introduced in 0070c1dc1b72 would for example return thg
(thanks to Mads Kiilerich for pointing to this)
2011-07-23 06:18:18 +02:00
Matt Mackall
757bb24a98 merge with stable 2011-09-10 17:56:42 -05:00
Simon Heimberg
97acb3896d util: improve finding of hgexecutable
check the module __main__ before looking on the default path
2011-07-23 06:18:18 +02:00
Martin Geisler
0bbf634d5b merge with stable 2011-08-30 15:22:10 +02:00
Adrian Buehlmann
f334675c97 util: postpone and reorder parent calculation in makedirs 2011-08-25 11:03:16 +02:00
Greg Ward
bc1dfb1ac9 atomictempfile: make close() consistent with other file-like objects.
The usual contract is that close() makes your writes permanent, so
atomictempfile's use of close() to *discard* writes (and rename() to
keep them) is rather unexpected. Thus, change it so close() makes
things permanent and add a new discard() method to throw them away.
discard() is only used internally, in __del__(), to ensure that writes
are discarded when an atomictempfile object goes out of scope.

I audited mercurial.*, hgext.*, and ~80 third-party extensions, and
found no one using the existing semantics of close() to discard
writes, so this should be safe.
2011-08-25 20:21:04 -04:00
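The revised contract can be sketched as follows: writes go to a temporary file beside the target, `close()` renames it into place, and `discard()` throws it away. The constructor signature and the use of `os.replace` (Python 3) are assumptions for illustration.

```python
import os
import tempfile


class atomictempfile:
    """Write to a temp file; close() publishes it, discard() drops it."""

    def __init__(self, name, mode="w"):
        self._name = name
        directory = os.path.dirname(os.path.abspath(name))
        fd, self._tempname = tempfile.mkstemp(prefix=".tmp-", dir=directory)
        self._fp = os.fdopen(fd, mode)
        self.write = self._fp.write

    def close(self):
        # Make the writes permanent by renaming over the target.
        if not self._fp.closed:
            self._fp.close()
            os.replace(self._tempname, self._name)

    def discard(self):
        # Throw the writes away and remove the temporary file.
        if not self._fp.closed:
            self._fp.close()
            try:
                os.unlink(self._tempname)
            except OSError:
                pass

    def __del__(self):
        if hasattr(self, "_fp"):
            self.discard()
```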
Mads Kiilerich
f701cb534b util.makedirs: make recursion simpler and more stable (issue2948)
Before, makedirs could call itself recursively with the same path name it was
given, relying on sane file system behavior to terminate the recursion. That
could cause infinite recursion on insane file systems.

Instead we now call mkdir explicitly after having created parent directory
recursively. Exceptions from this mkdir are not swallowed.
2011-08-22 00:42:38 +02:00
Mads Kiilerich
7ebc89bb63 util.makedirs: propagate chmod exceptions
The existing exception handling was intended to handle mkdir errors. Strange
chmod exceptions could thus have strange consequences - or be swallowed.
2011-08-22 00:35:42 +02:00
Mads Kiilerich
eec07d1052 util: wrap lines with multi-byte characters correctly (issue2943)
This re-introduces the unicode conversion that was lost in e5976ee55f4b 5 years
ago and had the comment:
  To avoid corrupting multi-byte characters in line, we must wrap
  a Unicode string instead of a bytestring.
2011-08-06 23:52:20 +02:00
Patrick Mezard
5e4ec42fdf http: explain why the host is passed to urllib2 password manager
The original comment was in url.getauthinfo() and was lost in 1835264d98c1.
2011-08-06 14:10:59 +02:00
Matt Mackall
96e41d94f5 merge with stable 2011-08-05 16:07:51 -05:00
Adrian Buehlmann
50665a9994 util: move copymode into posix.py and windows.py
reducing it to a NOP on Windows.

This eliminates a pointless stat call on Windows and reduces the risk of
interfering with other processes (e.g. AV-scanners, file change watchers).

See also http://mercurial.selenic.com/wiki/UnlinkingFilesOnWindows, item 2d
2011-08-02 13:18:56 +02:00
Adrian Buehlmann
cac3521896 util: factor new function copymode out of mktempcopy 2011-08-02 12:29:48 +02:00
Matt Mackall
e75325116a merge with stable 2011-08-01 10:54:34 -05:00
Matt Mackall
8fc00f653d url: handle urls of the form file:///c:/foo/bar/ correctly 2011-07-22 17:11:35 -05:00
Mads Kiilerich
e8138203dd util: rename the util.localpath that uses url to urllocalpath (issue2875)
util is never imported by any other name than util, so this is mostly just a
simple search and replace from util.localpath to util.urllocalpath (assuming
other uses of util.localpath have already been renamed).
2011-07-01 17:37:09 +02:00
Matt Mackall
8e3367eb55 subrepos: be smarter about what's an absolute path (issue2808) 2011-06-29 16:01:06 -05:00
Matt Mackall
6f9d587fae url: catch UNC paths as yet another Windows special case (issue2808) 2011-06-20 16:45:33 -05:00
Idan Kamara
5e2d608efc dispatch: write shell alias output to ui out descriptor 2011-06-07 13:39:09 +03:00