2355 Commits

Author SHA1 Message Date
Ales Tamchyna
8aa4958af4 typo 2011-09-08 15:10:47 +02:00
Ales Tamchyna
68edbb15ae bugfix 2011-09-08 14:56:03 +02:00
Ales Tamchyna
a7dba4310b bugfixes 2011-09-08 14:51:39 +02:00
Ales Tamchyna
4fa848ba9a more verbose logging 2011-09-07 15:43:53 +02:00
Ales Tamchyna
59e6e558c2 checking regression test results, looping while tests fail 2011-09-07 12:19:39 +02:00
Ales Tamchyna
94fa872b6d return '2' if a test failed 2011-09-07 11:16:44 +02:00
Ales Tamchyna
6cdf685708 regression tests run 2011-09-06 18:33:59 +02:00
Ales Tamchyna
c588c05517 added an example config file, minor modifications in testing code 2011-09-06 15:08:48 +02:00
leven101
762c47d8c9 added checks for loopy data in dynamic suffix array
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4169 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-01 09:57:16 +00:00
nicolabertoldi
75edc2eddd change to print the correct name of the features with InputScores
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4168 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-30 12:25:50 +00:00
bojar
ca1912961d first draft of cruise control for Moses
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4166 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-29 06:20:25 +00:00
hieuhoang1972
33ced5538a option to sort word alignment info, as suggested by arianna bisazza
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4165 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-26 02:38:48 +00:00
hieuhoang1972
5449839d75 option to sort word alignment info, as suggested by arianna bisazza
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4164 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-26 02:37:52 +00:00
heafield
56824c07e5 Should have made these return const * as well.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4163 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-24 16:44:59 +00:00
bhaddow
4d8f9a0716 Remove excessive debug
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4162 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-24 14:00:21 +00:00
mlegendr
3f0d83531f Part 3 of n-gram thing: added LanguageKenLM.h to public library headers
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4161 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-24 11:26:41 +00:00
heafield
6f391a7dbd Part 2 of Marc LEGENDRE's changes to expose n-gram length.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4160 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-24 10:45:41 +00:00
heafield
b3c06822ed Fix memory leak reported by Marc LEGENDRE. Also make the FFState for begin and null context const.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4158 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-24 10:15:29 +00:00
hieuhoang1972
3763b2466b run scorer regression test from any directory
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4155 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-22 07:56:46 +00:00
hieuhoang1972
1873030d24 forgot to add these files for regression tests
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4154 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-22 07:40:14 +00:00
machacekmatous
642e8dce95 Added evaluator to MERT directory. This tool computes a metric score for given candidate and reference files:
evaluator --sctype PER --reference ref.file --candidate cand.file

usage: evaluator [options] --reference ref1[,ref2[,ref3...]] --candidate cand1[,cand2[,cand3...]]
[--sctype|-s] the scorer type (default BLEU)
[--scconfig|-c] configuration string passed to scorer
        This is of the form NAME1:VAL1,NAME2:VAL2 etc
[--reference|-R] comma separated list of reference files
[--candidate|-C] comma separated list of candidate files
[--bootstrap|-b] number of bootstrapped samples (default 0 - no bootstrapping)
[--rseed|-r] the random seed for bootstrapping (defaults to system clock)
[--help|-h] print this message and exit


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4153 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-20 15:25:19 +00:00
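The --bootstrap and --rseed options in the evaluator commit above (642e8dce95) suggest bootstrap resampling of the test set. A minimal Python sketch of that general technique follows; it is not the evaluator's actual code. The function names and the exact-match metric are hypothetical stand-ins, and the real tool scores with BLEU, PER and other scorers.

```python
import random
from typing import Callable, List, Optional, Sequence


def exact_match_rate(cands: Sequence[str], refs: Sequence[str]) -> float:
    """Hypothetical stand-in metric: fraction of candidates equal to their reference."""
    return sum(c == r for c, r in zip(cands, refs)) / len(cands)


def bootstrap_scores(
    cands: Sequence[str],
    refs: Sequence[str],
    score_fn: Callable[[Sequence[str], Sequence[str]], float],
    samples: int,
    seed: Optional[int] = None,
) -> List[float]:
    """Score `samples` resampled test sets, each drawn with replacement."""
    # seed=None lets Python choose a seed from system entropy or the clock.
    rng = random.Random(seed)
    n = len(cands)
    scores = []
    for _ in range(samples):
        # Draw n sentence indices with replacement and score that resampled set.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(score_fn([cands[i] for i in idx], [refs[i] for i in idx]))
    return scores


if __name__ == "__main__":
    cands = ["a b", "c d", "e f", "g h"]
    refs = ["a b", "c x", "e f", "g h"]
    print(bootstrap_scores(cands, refs, exact_match_rate, samples=5, seed=1))
```

The spread of the returned scores gives a rough idea of how stable the metric is on the given test set. Fixing the seed makes a run repeatable.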
machacekmatous
63fd490a51 Added CDER metric to use in MERT.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4152 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-18 21:35:16 +00:00
oliver-wilson
96417949c2 Keep track of the order at which the last ngram request succeeded and
use it to inform the next request.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4151 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-18 12:36:02 +00:00
bojar
998b86f639 adding a TODO list for anyone, esp. Matous Machacek
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4150 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-17 10:01:14 +00:00
bojar
89c100ea83 revamp of mert-moses.pl (got rid of 'triples', relying on moses' -show-weights)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4149 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-17 09:15:19 +00:00
bojar
6e23604e7c removing old comments
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4148 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-17 09:04:49 +00:00
bojar
42ccbcc995 merged updates up to r4132 inclusive
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4147 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-17 09:02:45 +00:00
machacekmatous
3ef02eb7e6 merged in TER Scorer from mert-other_metrics (at r4140)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4146 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-16 16:21:31 +00:00
heafield
6dae77c3eb Fix segfault with the trie on models without <unk>. Problem was that trie writes correct counts to
the binary file header, including <unk>.  But the vocabulary was sized based on the ARPA file 
count (excluding <unk>).  Then when the binary file was loaded, the vocabulary size was based on 
the count including <unk>.  Fix this by pre-padding vocabulary to the count including <unk>.  

Also, some minor cleanups: remove a debug message and change some always-true returns to void.  



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4145 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-16 12:57:21 +00:00
bgottesman
436a285f18 stop using 'subtest' because it doesn't work for everyone, e.g. Hieu reports it doesn't work on a Mac even with an up-to-date Test::Simple module
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4144 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-16 10:51:43 +00:00
chesio
22da5782f3 Option to use --eppex added to train-model.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4143 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-14 17:42:12 +00:00
chesio
9f8fc06a2b Integration of eppex into scripts Makefile (similarly to memscore, a build failure won't stop the compilation).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4142 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-14 11:48:50 +00:00
chesio
27bb28885e Eppex: config.h renamed to typedefs.h (preparing for project autoconf-iguration).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4141 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-14 10:40:16 +00:00
chesio
019574fc61 Added eppex - an alternative to extract component of phrase-extract.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4138 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-10 09:13:29 +00:00
hieuhoang1972
87216f55be rename & make executable
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4136 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-09 09:07:34 +00:00
bgottesman
0fe1c629da if we fail to make the output directory for a test, just abort the test, don't exit the whole script
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4135 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 18:26:57 +00:00
bgottesman
24f5bf6723 when detokenizing, remove whitespace between a pair of CJK (Chinese/Japanese/Korean) words
This gets the Chinese and Japanese tests working, so remove the failure expectation.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4134 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 15:30:54 +00:00
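The CJK rule in the commit above (24f5bf6723) can be illustrated with a short Python sketch. This is not the detokenizer.perl implementation, which is written in Perl; the character ranges and the function name are assumptions chosen for illustration.

```python
import re

# Assumed ranges: hiragana/katakana, CJK unified ideographs, Hangul syllables.
_CJK = "\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af"

# Whitespace that has a CJK character immediately on both sides.
_CJK_GAP = re.compile(rf"(?<=[{_CJK}])\s+(?=[{_CJK}])")


def join_cjk(text: str) -> str:
    """Remove whitespace between adjacent CJK characters, leaving other spacing alone."""
    return _CJK_GAP.sub("", text)


if __name__ == "__main__":
    print(join_cjk("你 好 world"))
```

Whitespace next to a non-CJK word is kept, so mixed-script segments still separate Latin tokens.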
bgottesman
14587cdafc fix a detokenization bug that was preventing the removal of the whitespace following a contracted French or Italian article/pronoun (e.g. "l' immigration") when the contraction was the second-last word in the segment
remove the expectation of failure on the corresponding unit test


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4133 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 15:02:56 +00:00
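The behaviour described in the commit above (14587cdafc) is that a contracted article or pronoun such as "l'" attaches to the following word wherever it occurs in the segment. A rough Python sketch, assuming a simple regular-expression rule and a hypothetical language switch (the real detokenizer.perl works differently):

```python
import re

# A word ending in an apostrophe, followed by whitespace and another word.
_ELISION = re.compile(r"\b(\w+')\s+(?=\w)")

# Assumption: only these languages get the elision rule in this sketch.
_ELIDING_LANGS = {"fr", "it"}


def attach_elided(text: str, lang: str) -> str:
    """Join a contracted article/pronoun to the next word for French or Italian."""
    if lang not in _ELIDING_LANGS:
        return text
    return _ELISION.sub(r"\1", text)


if __name__ == "__main__":
    print(attach_elided("contre l' immigration .", "fr"))
```

Because the rule does not depend on token position, a contraction in second-last position is handled the same way as any other.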
rsennrich
79142d18e6 replace hard-coded path with variable
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4132 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 14:24:09 +00:00
bgottesman
9d9977bc6f add TODO tests for detokenization of Chinese and Japanese
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4131 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 13:21:05 +00:00
bgottesman
c030dae094 Allow a test case to have an undefined language, since the detokenizer doesn't require a language to be passed in and, indeed, errors if a language is passed in for which there are no special rules (which seems dubious to me ...). Add test case TEST_GERMAN_NONASCII with an undefined language.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4130 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 19:14:01 +00:00
theleopardess
d7752b44fc I tested check-in by adding a test line in moses/src/StaticData.cpp, producing a trivial moses revision 4122. Now I have removed that line. Everything ok but sorry for the confusion.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4129 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 18:57:09 +00:00
bgottesman
024bbe0bcc - factor out class DetokenizerTestCase
- create an array of all of the test cases before running any of them
- in the case of an expected failure, move the TODO block deeper, just around the validation of the results

I'm not 100% sure I like this change; I think it makes the code slightly more elegant, but it also makes it longer.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4128 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 18:48:37 +00:00
bgottesman
d521287a3f move commas to after here-docs, to hopefully make test cases more readable; and remove unused import
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4125 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 16:37:49 +00:00
bgottesman
76c3ef4dba a few more detokenization tests, including a TODO one that exposes a bug
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4124 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 16:23:47 +00:00
theleopardess
f8a99e5d6d yanggao-softdep-v0
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4122 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 15:11:43 +00:00
bgottesman
eda0f4e370 An initial test suite for detokenizer.perl.
I realize this doesn't quite fit the paradigm of the existing moses test suite.  On the other hand, it's self-contained, easy to run, easy to add tests to (just follow the pattern in the section titled 'Definitions of individual test cases'), and uses an established Perl testing framework.  I don't think it will be infeasible to incorporate it into the existing test suite.

Usage:

run-test-detokenizer.t --results-dir <RESULTS-DIRECTORY>

where <RESULTS-DIRECTORY> is an empty existing directory where the output can be written


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4121 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 14:32:39 +00:00
hieuhoang1972
30ca534b86 faster scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4119 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:27:15 +00:00
hieuhoang1972
b4c79f721e regression test for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4118 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:18:07 +00:00
hieuhoang1972
b618aadf8d regression test for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4117 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 09:23:48 +00:00