mirror of
https://github.com/facebook/sapling.git
synced 2024-10-09 16:31:02 +03:00
9cce255bec
Mercurial has problem around text wrapping/filling in MBCS encoding environment, because standard 'textwrap' module of Python can not treat it correctly. It splits byte sequence for one character into two lines. According to unicode specification, "east asian width" classifies characters into: W(ide), N(arrow), F(ull-width), H(alf-width), A(mbiguous) W/N/F/H can be always recognized as 2/1/2/1 bytes in byte sequence, but 'A' can not. Size of 'A' depends on language in which it is used. Unicode specification says: If the context(= language) cannot be established reliably they should be treated as narrow characters by default but many of class 'A' characters are full-width, at least, in Japanese environment. So, this patch treats class 'A' characters as full-width always for safety wrapping. This patch focuses only on MBCS safe-ness, not on writing/printing rule strict wrapping for each languages MBCS sensitive textwrap class is originally implemented by ITO Nobuaki <daydream.trippers@gmail.com>.
175 lines
4.8 KiB
Plaintext
175 lines
4.8 KiB
Plaintext
adding changesets
|
|
adding manifests
|
|
adding file changes
|
|
added 2 changesets with 2 changes to 1 files
|
|
(run 'hg update' to get a working copy)
|
|
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
|
|
% should fail with encoding error
|
|
M a
|
|
? latin-1
|
|
? latin-1-tag
|
|
? utf-8
|
|
transaction abort!
|
|
rollback completed
|
|
abort: decoding near ' encoded: é': 'ascii' codec can't decode byte 0xe9 in position 20: ordinal not in range(128)!
|
|
% these should work
|
|
marked working directory as branch é
|
|
% hg log (ascii)
|
|
changeset: 5:db5520b4645f
|
|
branch: ?
|
|
tag: tip
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin1 branch
|
|
|
|
changeset: 4:9cff3c980b58
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: Added tag ? for changeset 770b9b11621d
|
|
|
|
changeset: 3:770b9b11621d
|
|
tag: ?
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: utf-8 e' encoded: ?
|
|
|
|
changeset: 2:0572af48b948
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin-1 e' encoded: ?
|
|
|
|
changeset: 1:0e5b7e3f9c4a
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: koi8-r: ????? = u'\u0440\u0442\u0443\u0442\u044c'
|
|
|
|
changeset: 0:1e78a93102a3
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin-1 e': ? = u'\xe9'
|
|
|
|
% hg log (latin-1)
|
|
changeset: 5:db5520b4645f
|
|
branch: é
|
|
tag: tip
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin1 branch
|
|
|
|
changeset: 4:9cff3c980b58
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: Added tag é for changeset 770b9b11621d
|
|
|
|
changeset: 3:770b9b11621d
|
|
tag: é
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: utf-8 e' encoded: é
|
|
|
|
changeset: 2:0572af48b948
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin-1 e' encoded: é
|
|
|
|
changeset: 1:0e5b7e3f9c4a
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: koi8-r: ÒÔÕÔØ = u'\u0440\u0442\u0443\u0442\u044c'
|
|
|
|
changeset: 0:1e78a93102a3
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin-1 e': é = u'\xe9'
|
|
|
|
% hg log (utf-8)
|
|
changeset: 5:db5520b4645f
|
|
branch: é
|
|
tag: tip
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin1 branch
|
|
|
|
changeset: 4:9cff3c980b58
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: Added tag é for changeset 770b9b11621d
|
|
|
|
changeset: 3:770b9b11621d
|
|
tag: é
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: utf-8 e' encoded: é
|
|
|
|
changeset: 2:0572af48b948
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin-1 e' encoded: é
|
|
|
|
changeset: 1:0e5b7e3f9c4a
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: koi8-r: ÒÔÕÔØ = u'\u0440\u0442\u0443\u0442\u044c'
|
|
|
|
changeset: 0:1e78a93102a3
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin-1 e': é = u'\xe9'
|
|
|
|
% hg tags (ascii)
|
|
tip 5:db5520b4645f
|
|
? 3:770b9b11621d
|
|
% hg tags (latin-1)
|
|
tip 5:db5520b4645f
|
|
é 3:770b9b11621d
|
|
% hg tags (utf-8)
|
|
tip 5:db5520b4645f
|
|
é 3:770b9b11621d
|
|
% hg branches (ascii)
|
|
? 5:db5520b4645f
|
|
default 4:9cff3c980b58 (inactive)
|
|
% hg branches (latin-1)
|
|
é 5:db5520b4645f
|
|
default 4:9cff3c980b58 (inactive)
|
|
% hg branches (utf-8)
|
|
é 5:db5520b4645f
|
|
default 4:9cff3c980b58 (inactive)
|
|
% hg log (utf-8)
|
|
changeset: 5:db5520b4645f
|
|
branch: é
|
|
tag: tip
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin1 branch
|
|
|
|
changeset: 4:9cff3c980b58
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: Added tag é for changeset 770b9b11621d
|
|
|
|
changeset: 3:770b9b11621d
|
|
tag: é
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: utf-8 e' encoded: é
|
|
|
|
changeset: 2:0572af48b948
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin-1 e' encoded: é
|
|
|
|
changeset: 1:0e5b7e3f9c4a
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: koi8-r: ртуть = u'\u0440\u0442\u0443\u0442\u044c'
|
|
|
|
changeset: 0:1e78a93102a3
|
|
user: test
|
|
date: Mon Jan 12 13:46:40 1970 +0000
|
|
summary: latin-1 e': И = u'\xe9'
|
|
|
|
% hg log (dolphin)
|
|
abort: unknown encoding: dolphin, please check your locale settings
|
|
abort: decoding near 'é': 'ascii' codec can't decode byte 0xe9 in position 0: ordinal not in range(128)!
|
|
abort: branch name not in UTF-8!
|