sapling/eden/hg-server/tests/test-encoding.t
Durham Goode 98d9269874 server: copy hg to a new hg-server directory
Summary:
Create a fork of the Mercurial code that we can use to build server
rpms. The hg servers will continue to exist for a few more months while we move
the darkstorm and ediscovery use cases off them. In the mean time, we want to
start making breaking changes to the client, so let's create a stable copy of
the hg code to produce rpms for the hg servers.

The fork is based off c7770c78d, the latest hg release.

This copies the files as is, then adds some minor tweaks to get it to build:
- Disables some lint checks that appear to be bypassed by path
- sed replace eden/scm with eden/hg-server
- Removed a dependency on scm/telemetry from the edenfs-client tests since
  scm/telemetry pulls in the original eden/scm/lib/configparser which conflicts
  with the hg-server conflict parser.

allow-large-files

Reviewed By: quark-zju

Differential Revision: D27632557

fbshipit-source-id: b2f442f4ec000ea08e4d62de068750832198e1f4
2021-04-09 10:09:06 -07:00

158 lines
4.1 KiB
Perl

#require py2
Test character encoding
$ hg init t
$ cd t
we need a repo with some legacy latin-1 changesets
$ hg unbundle "$TESTDIR/bundles/legacy-encoding.hg"
adding changesets
adding manifests
adding file changes
added 2 changesets with 2 changes to 1 files
$ hg co
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ $PYTHON << EOF
> f = open('latin-1', 'wb'); _ = f.write(b"latin-1 e' encoded: \xe9"); f.close()
> f = open('utf-8', 'wb'); _ = f.write(b"utf-8 e' encoded: \xc3\xa9"); f.close()
> f = open('latin-1-tag', 'wb'); _ = f.write(b"\xe9"); f.close()
> EOF
should fail with encoding error
$ echo "plain old ascii" > a
$ hg st
M a
? latin-1
? latin-1-tag
? utf-8
$ HGENCODING=ascii hg ci -l latin-1
abort: decoding near ' encoded: \xe9': 'utf8' codec can't decode byte 0xe9 in position 20: unexpected end of data! (esc)
[255]
these should work
$ echo "latin-1" > a
$ HGENCODING=latin-1 hg ci -l latin-1
$ echo "utf-8" > a
$ HGENCODING=utf-8 hg ci -l utf-8
hg log (ascii)
$ hg --encoding ascii log
commit: ca661e7520de
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: utf-8 e' encoded: ?
commit: 650c6f3d55dd
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin-1 e' encoded: ?
commit: 0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: ????? = u'\u0440\u0442\u0443\u0442\u044c'
commit: 1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': ? = u'\xe9'
hg log (latin-1)
$ hg --encoding latin-1 log
commit: ca661e7520de
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: utf-8 e' encoded: \xe9 (esc)
commit: 650c6f3d55dd
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin-1 e' encoded: \xe9 (esc)
commit: 0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: \xd2\xd4\xd5\xd4\xd8 = u'\\u0440\\u0442\\u0443\\u0442\\u044c' (esc)
commit: 1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': \xe9 = u'\\xe9' (esc)
hg log (utf-8)
$ hg --encoding utf-8 log
commit: ca661e7520de
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: utf-8 e' encoded: \xc3\xa9 (esc)
commit: 650c6f3d55dd
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin-1 e' encoded: \xc3\xa9 (esc)
commit: 0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: \xc3\x92\xc3\x94\xc3\x95\xc3\x94\xc3\x98 = u'\\u0440\\u0442\\u0443\\u0442\\u044c' (esc)
commit: 1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': \xc3\xa9 = u'\\xe9' (esc)
hg log (utf-8)
$ HGENCODING=utf-8 hg log
commit: ca661e7520de
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: utf-8 e' encoded: \xc3\xa9 (esc)
commit: 650c6f3d55dd
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin-1 e' encoded: \xc3\xa9 (esc)
commit: 0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: \xc3\x92\xc3\x94\xc3\x95\xc3\x94\xc3\x98 = u'\\u0440\\u0442\\u0443\\u0442\\u044c' (esc)
commit: 1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': \xc3\xa9 = u'\\xe9' (esc)
hg log (dolphin)
$ HGENCODING=dolphin hg log
abort: unknown encoding: dolphin
(please check your locale settings)
[255]
$ cp latin-1-tag .hg/branch
$ HGENCODING=latin-1 hg ci -m 'auto-promote legacy name'
$ cd ..
Test roundtrip encoding/decoding of utf8b for generated data
#if hypothesis
>>> from hypothesishelpers import *
>>> from edenscm.mercurial import encoding
>>> roundtrips(st.binary(), encoding.fromutf8b, encoding.toutf8b)
Round trip OK
#endif