Python encoding in Python 2 and Python 3

In Python 2, the binary type is the native string type, so the functions b and nativestr should behave the same. In Python 3, unicode is the native string type and the binary type holds encoded unicode text. This raises a question: why do we encode unicode strings in Python 3 with encode('latin-1') instead of using UTF-8?
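
To make the question above concrete, here is a small sketch that compares the two codecs using only built-in str/bytes methods. The variable names are illustrative and are not part of this commit; it is meant as a demonstration of how the codecs differ, not as the project's implementation.

```python
# Illustration only: plain str/bytes methods, not code taken from this commit.
text = u"caf\u00e9"  # U+00E9 lies inside the Latin-1 range

latin1_bytes = text.encode("latin-1")  # one byte per character
utf8_bytes = text.encode("utf-8")      # U+00E9 becomes two bytes

# latin-1 maps each byte value 0-255 to the code point with the same number,
# so decoding arbitrary bytes with it never fails and round-trips exactly.
raw = bytes(bytearray(range(256)))
round_trip_ok = raw.decode("latin-1").encode("latin-1") == raw

# Code points above U+00FF cannot be encoded as latin-1 at all.
try:
    u"\u4e2d".encode("latin-1")
    latin1_handles_cjk = True
except UnicodeEncodeError:
    latin1_handles_cjk = False

# UTF-8 can encode any code point, at the cost of variable-length output.
utf8_cjk = u"\u4e2d".encode("utf-8")
```

In short, latin-1 is a lossless byte-for-byte mapping but fails on text outside U+0000 to U+00FF, while UTF-8 accepts any unicode text.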
Mingjian Lu 2017-06-13 17:13:59 -04:00 committed by GitHub
parent 7547ab9acf
commit 38310a5ec3

@@ -85,7 +85,9 @@ if PY3:
        return iter(d.values())
    def u(s):
        if isinstance(s, text_type):
            return s
        return s.decode('utf-8', 'replace')
    def b(s):
        if isinstance(s, binary_type):
@@ -143,10 +145,14 @@ else:
        return d.itervalues()
    def u(s):
        if isinstance(s, text_type):
            return s
        return s.decode('utf-8')
    def b(s):
        if isinstance(s, binary_type):
            return s
        return s.encode('utf-8', 'replace')
    def nativestr(s):
        if isinstance(s, binary_type):