mosesdecoder/scripts/tokenizer
Antoine Dusséaux d04bdc7440 Separate comma after a number end sentence
Separate "," after a number if it's the end of a sentence.

Example input:

He is tall,
He was born in 1800,
He wants to go there in 2000.

Tokenized output:

He is tall ,
He was born in 1800 ,
He wants to go there in 2000 .
2016-07-31 14:10:07 +02:00
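
The rule described in the commit message can be sketched in a few lines. This is a minimal Python illustration, not the regex used in tokenizer.perl: the function name `separate_trailing_comma` is hypothetical, and the sketch assumes one sentence per line, so "end of a sentence" is treated as "end of the line". It covers only the new comma-after-number case; the other splits in the example (the comma after "tall" and the final period) are assumed to be handled by rules the tokenizer already had.

```python
import re


def separate_trailing_comma(line: str) -> str:
    """Put a space before a comma that follows a digit and ends the line.

    Hypothetical helper for illustration only; it is not part of the
    Moses scripts.
    """
    # (\d) captures the last digit of the number; ",$" matches a comma
    # sitting at the very end of the line. A comma inside a number such
    # as "1,000" is not at the end of the line, so it is left alone.
    return re.sub(r"(\d),$", r"\1 ,", line)


if __name__ == "__main__":
    for sentence in ["He was born in 1800,", "He has 1,000 books"]:
        print(separate_trailing_comma(sentence))
```

Anchoring the pattern at the end of the line is what keeps thousands separators intact: only a comma with nothing after it is split off.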
File                              Last commit                                                             Date
basic-protected-patterns          makemteval and small change to tokenizer. /Tom Hoar and Tomas Fulajtar  2014-11-21 13:55:13 +00:00
deescape-special-chars-PTB.perl   Add license notices to scripts.                                         2015-05-29 18:30:26 +07:00
deescape-special-chars.perl       Add license notices to scripts.                                         2015-05-29 18:30:26 +07:00
delete-long-words.perl            add script for acquis cleaning                                          2016-04-19 10:02:46 +01:00
detokenizer.perl                  Add license notices to scripts.                                         2015-05-29 18:30:26 +07:00
escape-special-chars.perl         Add license notices to scripts.                                         2015-05-29 18:30:26 +07:00
lowercase.perl                    Add license notices to scripts.                                         2015-05-29 18:30:26 +07:00
normalize-punctuation.perl        Add license notices to scripts.                                         2015-05-29 18:30:26 +07:00
pre_tokenize_cleaning.py          Add license notices to scripts.                                         2015-05-29 18:30:26 +07:00
pre-tok-clean.perl                Add license notices to scripts.                                         2015-05-29 18:30:26 +07:00
pre-tokenizer.perl                Add license notices to scripts.                                         2015-05-29 18:30:26 +07:00
remove-non-printing-char.perl     Add license notices to scripts.                                         2015-05-29 18:30:26 +07:00
replace-unicode-punctuation.perl  Add license notices to scripts.                                         2015-05-29 18:30:26 +07:00
tokenizer_PTB.perl                ga (mostly) behaves more like fr/it                                     2015-09-23 14:33:18 +01:00
tokenizer.perl                    Separate comma after a number end sentence                              2016-07-31 14:10:07 +02:00