Commit Graph

2538 Commits

Author SHA1 Message Date
heafield
b3c06822ed Fix memory leak reported by Marc LEGENDRE. Also make the FFState for begin and null context const.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4158 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-24 10:15:29 +00:00
hieuhoang1972
3763b2466b run scorer regression test from any directory
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4155 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-22 07:56:46 +00:00
hieuhoang1972
1873030d24 forgot to add these files for regression tests
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4154 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-22 07:40:14 +00:00
machacekmatous
642e8dce95 Added evaluator to MERT directory. This tool computes a metric score for given candidate and reference files:
evaluator --sctype PER --reference ref.file --candidate cand.file

usage: evaluator [options] --reference ref1[,ref2[,ref3...]] --candidate cand1[,cand2[,cand3...]]
[--sctype|-s] the scorer type (default BLEU)
[--scconfig|-c] configuration string passed to scorer
        This is of the form NAME1:VAL1,NAME2:VAL2 etc
[--reference|-R] comma separated list of reference files
[--candidate|-C] comma separated list of candidate files
[--bootstrap|-b] number of booststraped samples (default 0 - no bootstraping)
[--rseed|-r] the random seed for bootstraping (defaults to system clock)
[--help|-h] print this message and exit


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4153 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-20 15:25:19 +00:00
machacekmatous
63fd490a51 Added CDER metric to use in MERT.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4152 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-18 21:35:16 +00:00
oliver-wilson
96417949c2 Keep track of the order at which the last ngram request succeeded and
use it to inform the next request.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4151 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-18 12:36:02 +00:00
bojar
998b86f639 addind a TODO list for anyone, esp. Matous Machacek
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4150 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-17 10:01:14 +00:00
bojar
89c100ea83 revamp of mert-moses.pl (got rid of 'triples', relying on moses' -show-weights)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4149 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-17 09:15:19 +00:00
bojar
6e23604e7c removing old comments
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4148 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-17 09:04:49 +00:00
bojar
42ccbcc995 merged updates up to r4132 inclusive
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4147 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-17 09:02:45 +00:00
machacekmatous
3ef02eb7e6 merged in TER Scorer from mert-other_metrics (at r4140)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4146 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-16 16:21:31 +00:00
heafield
6dae77c3eb Fix segfault withe trie on models without <unk>. Problem was that trie writes correct counts to
the binary file header, including <unk>.  But the vocabulary was sized based on the ARPA file 
count (excluding <unk>).  Then when the binary file was loaded, the vocabulary size was based on 
the count including <unk>.  Fix this by pre-padding vocabulary to the count including <unk>.  

Also, some minor cleanups: remove a debug message and change some always-true returns to void.  



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4145 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-16 12:57:21 +00:00
bgottesman
436a285f18 stop using 'subtest' because it doesn't work for everyone, e.g. Hieu reports it doesn't work on a Mac even with an up-to-date Test::Simple module
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4144 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-16 10:51:43 +00:00
chesio
22da5782f3 Option to use --eppex added to train-model.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4143 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-14 17:42:12 +00:00
chesio
9f8fc06a2b Integration of eppex into scripts Makefile (similarly to memscore, fail of build won't stop the compilation).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4142 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-14 11:48:50 +00:00
chesio
27bb28885e Eppex: config.h renamed to typedefs.h (preparing for project autoconf-iguration).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4141 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-14 10:40:16 +00:00
chesio
019574fc61 Added eppex - an alternative to extract component of phrase-extract.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4138 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-10 09:13:29 +00:00
hieuhoang1972
87216f55be rename & make executable
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4136 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-09 09:07:34 +00:00
bgottesman
0fe1c629da if we fail to make the output directory for a test, just abort the test, don't exit the whole script
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4135 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 18:26:57 +00:00
bgottesman
24f5bf6723 when detokenizing, remove whitespace between a pair of CJK (Chinese/Japanese/Korean) words
This gets the Chinese and Japanese tests working, so remove the failure expectation.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4134 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 15:30:54 +00:00
bgottesman
14587cdafc fix a detokenization bug that was preventing the removal of the whitespace following a contracted French or Italian article/pronoun (e.g. "l' immigration") when the contraction was the second-last word in the segment
remove the expectation of failure on the corresponding unit test


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4133 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 15:02:56 +00:00
rsennrich
79142d18e6 replace hard-coded path with variable
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4132 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 14:24:09 +00:00
bgottesman
9d9977bc6f add TODO tests for detokenization of Chinese and Japanese
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4131 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 13:21:05 +00:00
bgottesman
c030dae094 Allow a test case to have an undefined language, since the detokenizer doesn't require a language to be passed in and, indeed, errors if a language is passed in for which there are no special rules (which seems dubious to me ...). Add test case TEST_GERMAN_NONASCII with an undefined language.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4130 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 19:14:01 +00:00
theleopardess
d7752b44fc I tested check-in by adding a test line in moses/src/StaticData.cpp, producing a trivial moses revision 4122. Now I have removed that line. Everything ok but sorry for the confusion.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4129 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 18:57:09 +00:00
bgottesman
024bbe0bcc - factor out class DetokenizerTestCase
- create an array of all of the test cases before running any of them
- in the case of an expected failure, move the TODO block deeper, just around the validation of the results

I'm not 100% I like this change, I think it makes the code slightly more elegant but it also makes it longer.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4128 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 18:48:37 +00:00
bgottesman
d521287a3f move commas to after here-docs, to hopefully make test cases more readable; and remove unused import
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4125 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 16:37:49 +00:00
bgottesman
76c3ef4dba a few more detokenization tests, including a TODO one that exposes a bug
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4124 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 16:23:47 +00:00
theleopardess
f8a99e5d6d yanggao-softdep-v0
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4122 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 15:11:43 +00:00
bgottesman
eda0f4e370 An initial test suite for detokenizer.perl.
I realize this doesn't quite fit the paradigm if the existing moses test suite.  On the other hand, it's self-contained, easy to run, easy to add tests to (just follow the pattern in the section titled 'Definitions of individual test cases'), and uses an established Perl testing framework.  I don't think it will be infeasible to incorporate it into the existing test suite.

Usage:

run-test-detokenizer.t --results-dir <RESULTS-DIRECTORY>

where <RESULTS-DIRECTORY> is an empty existing directory where the output can be written


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4121 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 14:32:39 +00:00
hieuhoang1972
30ca534b86 faster scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4119 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:27:15 +00:00
hieuhoang1972
b4c79f721e regression test for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4118 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:18:07 +00:00
hieuhoang1972
b618aadf8d regression test for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4117 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 09:23:48 +00:00
hieuhoang1972
b8a0b09206 regression test for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4116 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 02:48:30 +00:00
bojar
779873a2a2 merged Philipp's updates up to r4106 inclusive
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4115 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 23:05:18 +00:00
bojar
7a301a7b5a negligible polishing
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4114 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 22:24:35 +00:00
hieuhoang1972
fc176801d6 regression test for score
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4112 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 09:15:43 +00:00
hieuhoang1972
e988361d62 regression test for score
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4111 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 08:11:40 +00:00
hieuhoang1972
cdbb850cc3 fix new scorer to output phrase pairs in same order as old scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4110 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 07:36:25 +00:00
hieuhoang1972
e7b97c1b1a vs build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4109 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 04:53:21 +00:00
heafield
61974ad75e Minor fixes. One for David Chiang who has files without initial newlines.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4108 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 19:46:19 +00:00
phkoehn
36db0ffe48 added pairwise ranked optimization (PRO) as proposd by [Hopkins&May,2011], just use switch --pairwise-ranked
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4106 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 17:00:17 +00:00
nicolabertoldi
579d8b0760 added few regression tests explicitly working with IRSTLM; modified few regression tests wrongly working with IRSTLM/SRILM; modified the required data archive (now version is 6);
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4105 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 09:19:48 +00:00
hieuhoang1972
49e56f35bb regression test for score
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4102 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:28:36 +00:00
hieuhoang1972
d45a29d9c7 data for score regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4101 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:10:09 +00:00
hieuhoang1972
ed4367ceb0 data for score regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4100 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:04:53 +00:00
hieuhoang1972
69fe991923 data for score regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4099 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:53:25 +00:00
hieuhoang1972
1ae8c53a08 executable perl script
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4098 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:31:01 +00:00
hieuhoang1972
acb7e984de starting regression test for score program
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4097 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:27:59 +00:00
hieuhoang1972
65f7ffb783 delete debug message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4096 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:58:05 +00:00