Allow a test case to have an undefined language. The detokenizer doesn't require a language to be passed in and, indeed, errors if it is passed a language for which it has no special rules (which seems dubious to me ...). Add test case TEST_GERMAN_NONASCII, which leaves the language undefined.

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4130 1f5c12ca-751b-0410-a591-d2e778427230
bgottesman 2011-08-05 19:14:01 +00:00
parent d7752b44fc
commit c030dae094


@@ -98,6 +98,20 @@ EXP
$testCase->setExpectedToFail("A bug is causing this to be detokenized wrong.");
}
+# A German test involving non-ASCII characters
+# Note: We don't specify a language because the detokenizer errors if you pass in a language for which it has no special rules, of which German is an example.
+&addDetokenizerTest("TEST_GERMAN_NONASCII", undef,
+<<'TOK'
+Ich hoffe , daß Sie schöne Ferien hatten .
+Frau Präsidentin ! Frau Díez González und ich hatten einige Anfragen
+TOK
+,
+<<'EXP'
+Ich hoffe, daß Sie schöne Ferien hatten.
+Frau Präsidentin! Frau Díez González und ich hatten einige Anfragen
+EXP
+);
######################################
# Now run those babies ...
######################################
@@ -145,7 +159,7 @@ sub runDetokenizerTest {
close TRUTH;
&runTest($testCase->getName(), $testOutputDir, $tokenizedFile, sub {
-return [$detokenizer, "-l", $testCase->getLanguage()];
+return defined($testCase->getLanguage())? [$detokenizer, "-l", $testCase->getLanguage()] : [$detokenizer];
}, sub {
&verifyIdentical($testCase->getName(), $expectedFile, catfile($testOutputDir, "stdout.txt"))
}, 1, $testCase->getFailureExplanation());
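
Note: the second hunk builds the detokenizer command conditionally, passing "-l" only when the test case defines a language. The following is a minimal standalone sketch of that pattern, not code from the commit. The helper name build_detokenizer_command and the script name "detokenizer.perl" are placeholders chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the commit): return the command as an
# array reference, appending "-l <language>" only when a language is defined.
sub build_detokenizer_command {
    my ($detokenizer, $language) = @_;
    my @command = ($detokenizer);
    push @command, "-l", $language if defined $language;
    return \@command;
}

# "detokenizer.perl" is a placeholder path for this sketch.
print join(" ", @{ build_detokenizer_command("detokenizer.perl", "en") }), "\n";
print join(" ", @{ build_detokenizer_command("detokenizer.perl", undef) }), "\n";
```

Building the list incrementally with push avoids repeating the command name in both branches of a ternary, which may be easier to extend if further optional flags are added later.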