Extension authors (notably at companies using hg) have been
cargo-culting the `testedwith = 'internal'` bit from hg's own
extensions, which then defeats our "file bugs over here" logic in
dispatch. Let's be more aggressive about trying to give extension
authors a hint about what testedwith should say.
This changeset fix the problem to use win32mbcs with mercurial 2.3 or
later.
The problem is brought by side effect of modification of
encoding.upper() (changeset 17236:8b2442729eb3) because upper() does
not accept unicode string argument. So wrapped util.normcase() which
uses upper() will fail. In other words, upper() and lower() are
unicode incompatible.
To fix this issue, this changeset adds new wrapper for reversed
conversion (unicode to str) for lower() and upper() to use them
safely.
this patch allows win32mbcs extension to be enabled on cygwin platform
for problematic character encodings.
on recent cygwin platform, even though
"os.path.supports_unicode_filenames" is False, "os.listdir()" and
other path manipulation functions can return the result correctly
decoded in unicode for invocations with unicode arguments, if locale
is configured properly.
existing code to check "os.path.supports_unicode_filenames" is kept to
prevent win32mbcs from being enabled on unexpected platform.
this patch uses encoding.lower/upper for case folding, because ones of
str can not fold case of non ascii characters correctly.
to avoid cyclic dependency and to encapsulate logic of normcase in
each platforms, this patch introduces encodinglower/encodingupper in
both posix/windows specific files.
this patch does not change implementation of normcase() in posix.py,
because we do not know the encoding of filenames on POSIX.
some "normcase()" are excluded from function wrap list in
hgext/win32mbcs.py, because they become encoding aware by this patch.
this patch uses upper() instead of lower() or os.path.normcase() for
case folding on Windows(NTFS), because lower-ing causes problems for
some languages on it.
see below for detail about problem of lower-ing:
https://blogs.msdn.com/b/michkap/archive/2005/01/16/353873.aspx
util.checkwinfilename() and util.checkosfilename() are wrapped.
The former, used by scmutil, is not friendly for Shift_JIS bytes.
The latter is an dynamic alias of checkwinfilename() used by other
modules.
Trying as much as possible to consistently:
- use a present tense predicate followed by a direct object
- verb referring directly to the functionality provided
(ie. not "add command that does this" but simple "do that")
- keep simple and to the point, leaving details for the long help
(width is tight, possibly even more so for translations)
Thanks to timeless, Martin Geisler, Rafael Villar Burke, Dan Villiom
Podlaski Christiansen and others for the helpful suggestions.
Add win32mbcs.encoding configuration option to specify the encoding to
use instead of encoding.encoding.
This option is useful for the users who want to write UTF-8 log
message on non UTF-8 path encoding environment.
Changes docstrings to begin with a lowercase word. Only docstrings
used in help output is changed.
Scripts are not expected to grep the output of 'hg help' so this
change should pose no problem with regard to the compatibility rules.
* Code cleanup by Matt.
* Fix the issue with case-insensitive fs support
by wrapping also util.fspath() and util.checkcase()
* Abort program when path conversion is failed.
The aim of this extension is to clear the problem related to having
0x5c in 2nd byte of encoded bytes. So this extension is usefull for:
* Japanese Windows user shift_jis encoding.
* Chinese Windows user using big5 encoding.
To use this extension, simply enable it without any customization.
Note that some important python built-in functions and mercurial
functions are altered for this extension to convert argument if need
to handle MBCS.