sapling/eden/scm/edenscm
Mark Thomas 5168c29e12 encoding: use correct output encoding on windows
Summary:
On Windows, there are *two* 8-bit encodings for each process.

* The ANSI code page is used for all `...A` system calls, and this is what
  Mercurial uses internally.  It can be overridden using the `--encoding`
  command line option.

* The OEM code page is used when outputing to the console.  Mercurial has no
  concept of this, and instead renders to the console using the ANSI code page,
  which results in mojibake like "Θ" instead of "é".

Add the concept of an `outputencoding`.  If this differs from `encoding`, we
convert from the local encoding to the output encoding before writing to the
console.

On non-Windows platforms, this defaults to the same encoding as the local encoding,
so this is a no-op unless `--outputencoding` is manually specified.

On Windows, this defaults to the codepage given by `GetOEMCP`, causing output
to be converted to the OEM codepage before being printed.

For ordinary strings, the local encoded version is wrapped by `localstr` if the
encoding does not round-trip cleanly.  This means the output encoding works
even if the character is not represented in the local encoding.

Unfortunately, the templater is not localstr-clean, which means strings can get
flattened down to the local encoding and the original code points are lost.  In
this case we can only output characters which are in the intersection of the
encoding and the output encoding.

Most US English Windows systems use cp1252 for the ANSI code page and cp437 for
the OEM code page.  These both contain many accented characters, so users with
accented characters in their names will now see them correctly rendered.

All of this only applies to Python 2.7.  In Python 3, everything is Unicode,
the `--encoding` and `--outputencoding` options do nothing, and it just works.

Reviewed By: quark-zju, ikostia

Differential Revision: D19951381

fbshipit-source-id: d5cb8b5bfe2bc131b2e6c3b892137a48b2139ca9
2020-02-20 04:28:48 -08:00
..
hgdemandimport Upgrade Pyre version for eden to 2927613de6d20ee2d66e98124f3834812475e122 2020-02-19 15:05:25 -08:00
hgext rage: force use of utf-8 and lines-square graph renderer 2020-02-20 04:28:48 -08:00
mercurial encoding: use correct output encoding on windows 2020-02-20 04:28:48 -08:00
__init__.py demandimport: re-enable 2020-02-05 11:23:29 -08:00
__main__.py edenscm: add a main module 2020-01-30 18:09:14 -08:00
traceimport.py pyre: add stub for "bindings" 2020-01-31 13:18:54 -08:00