Taku Kudo | 9ca65fa9b6 | add optional NFKD support. | 2022-05-29 11:43:42 +09:00
Taku Kudo | cbfc6b3c2c | updated *.tsv file. | 2021-06-18 01:06:29 +09:00
Sarubi | c970dedd8f | Removed code that replaced Zero Width Joiner with whitespace. | 2021-02-23 20:47:25 +05:30
Taku Kudo | 329383b455 | Initial release of 0.19. Merged internal sentencepiece. | 2020-05-08 01:06:50 +09:00
Taku Kudo | 18c337f32d | remove control characters in the default nmt_* normalizers | 2019-01-10 17:16:53 +09:00
Taku Kudo | e4a93f88c1 | Do not parse deprecated proto fields | 2019-01-08 19:25:11 +09:00
Taku Kudo | 7b19d68be0 | use builtin protobuf-lite package in third_party | 2019-01-08 11:40:08 +09:00
Taku Kudo | 465a419250 | pushed new nfkc_cf.tsv | 2018-10-29 01:11:45 +09:00
Taku Kudo | 573586854e | Added normalization with Unicode case folding | 2018-06-29 15:17:18 +09:00
Taku Kudo | cff66162c3 | Remove empty lines from example data | 2018-06-20 17:23:18 +09:00
Taku Kudo | a65ca0d829 | Uses NMT_NFKC rule by default. | 2018-06-10 01:15:34 +09:00
Taku Kudo | 4e3bcf1373 | Updated normalizer | 2018-06-04 20:32:37 +09:00
Taku Kudo | 2928ce5307 | Initialize repository | 2017-03-07 19:43:50 +09:00