mirror of
https://github.com/moses-smt/mosesdecoder.git
synced 2024-12-26 21:42:19 +03:00
277 lines
2.6 KiB
Plaintext
#Anything in this file, followed by a period (and an upper-case word), does NOT indicate an end-of-sentence marker.
#Special cases are included for prefixes that ONLY appear before 0-9 numbers.

#any single upper case letter followed by a period is not a sentence ender (excluding I occasionally, but we leave it in)
#usually upper case letters are initials in a name
அ
ஆ
இ
ஈ
உ
ஊ
எ
ஏ
ஐ
ஒ
ஓ
ஔ
ஃ
க
கா
கி
கீ
கு
கூ
கெ
கே
கை
கொ
கோ
கௌ
க்
ச
சா
சி
சீ
சு
சூ
செ
சே
சை
சொ
சோ
சௌ
ச்
ட
டா
டி
டீ
டு
டூ
டெ
டே
டை
டொ
டோ
டௌ
ட்
த
தா
தி
தீ
து
தூ
தெ
தே
தை
தொ
தோ
தௌ
த்
ப
பா
பி
பீ
பு
பூ
பெ
பே
பை
பொ
போ
பௌ
ப்
ற
றா
றி
றீ
று
றூ
றெ
றே
றை
றொ
றோ
றௌ
ற்
ய
யா
யி
யீ
யு
யூ
யெ
யே
யை
யொ
யோ
யௌ
ய்
ர
ரா
ரி
ரீ
ரு
ரூ
ரெ
ரே
ரை
ரொ
ரோ
ரௌ
ர்
ல
லா
லி
லீ
லு
லூ
லெ
லே
லை
லொ
லோ
லௌ
ல்
வ
வா
வி
வீ
வு
வூ
வெ
வே
வை
வொ
வோ
வௌ
வ்
ள
ளா
ளி
ளீ
ளு
ளூ
ளெ
ளே
ளை
ளொ
ளோ
ளௌ
ள்
ழ
ழா
ழி
ழீ
ழு
ழூ
ழெ
ழே
ழை
ழொ
ழோ
ழௌ
ழ்
ங
ஙா
ஙி
ஙீ
ஙு
ஙூ
ஙெ
ஙே
ஙை
ஙொ
ஙோ
ஙௌ
ங்
ஞ
ஞா
ஞி
ஞீ
ஞு
ஞூ
ஞெ
ஞே
ஞை
ஞொ
ஞோ
ஞௌ
ஞ்
ண
ணா
ணி
ணீ
ணு
ணூ
ணெ
ணே
ணை
ணொ
ணோ
ணௌ
ண்
ந
நா
நி
நீ
நு
நூ
நெ
நே
நை
நொ
நோ
நௌ
ந்
ம
மா
மி
மீ
மு
மூ
மெ
மே
மை
மொ
மோ
மௌ
ம்
ன
னா
னி
னீ
னு
னூ
னெ
னே
னை
னொ
னோ
னௌ
ன்


#List of titles. These are often followed by upper-case names, but do not indicate sentence breaks
திரு
திருமதி
வண
கௌரவ


#misc - odd period-ending items that NEVER indicate breaks (p.m. does NOT fall into this category - it sometimes ends a sentence)
உ.ம்
#கா.ம்
#எ.ம்


#Numbers only. These should only induce breaks when followed by a numeric sequence
# add NUMERIC_ONLY after the word for this function
#This case is mostly for the English "No." which can either be a sentence of its own, or
#if followed by a number, a non-breaking prefix
No #NUMERIC_ONLY#
Nos
Art #NUMERIC_ONLY#
Nr
pp #NUMERIC_ONLY#
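The comments in this file describe two kinds of entries: plain prefixes, which never end a sentence when followed by a period, and prefixes tagged `#NUMERIC_ONLY#`, which are non-breaking only when a number follows. The Python sketch below illustrates one way a sentence splitter might read and apply such a list. It is a simplified illustration, not the Moses implementation: the function names `load_prefixes` and `is_sentence_break` are hypothetical, and the sketch ignores the upper-case check mentioned in the first comment.

```python
import re

# Matches "<prefix> #NUMERIC_ONLY#" and captures the prefix.
NUMERIC_ONLY_RE = re.compile(r"^(.*?)\s+#NUMERIC_ONLY#\s*$")


def load_prefixes(lines):
    """Parse prefix-file lines into {prefix: numeric_only_flag}.

    Blank lines and lines starting with '#' are skipped (hypothetical helper).
    """
    prefixes = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = NUMERIC_ONLY_RE.match(line)
        if match:
            prefixes[match.group(1)] = True   # non-breaking only before digits
        else:
            prefixes[line] = False            # always non-breaking
    return prefixes


def is_sentence_break(word, next_word, prefixes):
    """Return True if the period ending `word` should end the sentence.

    Simplified assumption: a word without a trailing period never breaks,
    and a word with one breaks unless the prefix list says otherwise.
    """
    if not word.endswith("."):
        return False
    stem = word[:-1]
    if stem in prefixes:
        numeric_only = prefixes[stem]
        if not numeric_only:
            return False                      # e.g. a title or an initial
        if next_word[:1].isdigit():
            return False                      # e.g. "No. 5"
    return True
```

With this sketch, an entry such as `திரு` suppresses a break after `திரு.`, while `No #NUMERIC_ONLY#` suppresses a break after `No.` only when the next token starts with a digit.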