737 Commits

Author SHA1 Message Date
phkoehn
7334d49191 minor experiment.perl fixes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3668 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 12:42:34 +00:00
pjwilliams
3ca16120a2 Add --MaxScope option to extract-rules (Hopkins and Langmead, 2010)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3661 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:55:57 +00:00
bojar
c0e0bc62c6 fixed a stupid bug from last commit
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3660 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:23:31 +00:00
bojar
878c7100de accept binarized ttables as well
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3659 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:22:01 +00:00
bojar
8cfc403fec default location of new mert
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3658 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:21:38 +00:00
phkoehn
c8ae94e426 training for global lexicon model
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3655 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-25 16:24:59 +00:00
chardmeier
ecf4b0d368 Check for right boost version in memscore.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3640 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-22 14:36:43 +00:00
phkoehn
ace33d16dd bug fix
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3636 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-22 07:30:52 +00:00
phkoehn
3b880bbdda added biconcor to make
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3634 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-21 10:04:38 +00:00
phkoehn
85a5a13e4c improvements to web analysis, fixes to syntax wrappers
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3633 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-21 09:49:27 +00:00
bhaddow
88eaf49c5e remove detokeniser
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3632 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-21 08:42:28 +00:00
sarst
0594b13c61 Added nonbreaking_prefix.sv for Swedish
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3630 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-19 12:45:49 +00:00
bhaddow
2dc951b062 More informative error messages
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3625 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-15 09:00:18 +00:00
rafpayen
a1ab166692 reset file handle between opens, so as to have an error if no file is given
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3623 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-14 18:23:05 +00:00
hieuhoang1972
e5edb4b971 delete duplicate detokenizer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3622 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-13 16:39:46 +00:00
hieuhoang1972
08739d8f49 add from josh's script
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3621 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-13 10:14:13 +00:00
hieuhoang1972
eedef63277 keep perl scripts with Unix line endings
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3612 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-11 11:32:27 +00:00
hieuhoang1972
105e83df82 beautify
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3610 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-07 19:08:44 +00:00
suzyh
d071296cde Fix to reuse-weights.perl to copy weights containing an exponent
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3590 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-29 04:04:12 +00:00
rsennrich
7929e4624e more informative error message when hierarchical phrase extraction fails.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3550 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-22 12:56:11 +00:00
rosasjolu
8746482d04 Change data files location
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3549 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-22 11:03:36 +00:00
rosasjolu
16302a45a8
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3548 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-22 11:02:29 +00:00
rosasjolu
d2fd75ac49 Change data files location
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3547 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-22 10:27:07 +00:00
rosasjolu
ad62e27c90
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3546 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-22 10:24:52 +00:00
hieuhoang1972
ee842f578c delete data files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3545 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-22 10:05:11 +00:00
phkoehn
f34b37bad3 added hierarchical alignment view to web analysis tool
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3514 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-17 13:28:04 +00:00
hieuhoang1972
a582d483cc add lowercase script
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3475 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-16 07:32:32 +00:00
phkoehn
fb8b0eb180 new prefix files for tokenizer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3467 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-15 16:06:04 +00:00
rosasjolu
128a885406
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3446 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-14 19:27:15 +00:00
suzyh
fa4eca6ccb Added loop to check_if_crashed in EMS experiment.perl to wait in case the .STDERR file is slow in appearing after the step has completed. Reinstated --old-sge and --filterfile command-line arguments to mert-moses.pl.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3420 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-09 11:40:40 +00:00
bhaddow
fd7997dbf5 Fix for mert script from Yu Chen, to make sure it reads the correct
feature order from the nbest list.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3419 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-08 08:17:05 +00:00
hieuhoang1972
e53aeb903c need non-empty arg in mert-moses.pl otherwise it crashes. The wonders of a non-typechecked, arg checking scripting language
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3417 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-04 00:16:26 +00:00
phkoehn
4a85fd95ce srilm as setting in interpolation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3416 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-03 12:57:16 +00:00
hieuhoang1972
51b99ede7a delete old qsub args
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3415 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-02 20:42:55 +00:00
bhaddow
12269b062c Only add queue-flags if non-empty.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3414 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-02 09:01:39 +00:00
bhaddow
7efec1a087 Set default max mert iterations to 25.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3413 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-01 19:57:55 +00:00
bhaddow
5f2f345165 max iterations option
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3412 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-01 17:02:12 +00:00
bgottesman
e409b6827c add --max-word-length option to cleaning script, with default value 1000; any segment with a word (or factor) exceeding this length in chars is discarded; motivated by symal.cpp, which has its own such parameter (hardcoded to 1000) and crashes if it encounters a word that exceeds it
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3410 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-23 16:35:14 +00:00
hieuhoang1972
083a9af215 delete alignment info for terminals
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3405 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-13 10:03:13 +00:00
rafpayen
b431f951c5
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3404 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-13 09:58:17 +00:00
hieuhoang1972
382799dd38 delete win32 make. out-of-date. find a better way of doing this
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3400 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-10 14:41:49 +00:00
hieuhoang1972
9600b6473c delete win32 make. out-of-date. find a better way of doing this
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3399 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-10 14:41:28 +00:00
bhaddow
08a8480136 rename mert-moses-new.pl to mert-moses.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3398 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-10 14:29:07 +00:00
bhaddow
321f528ff5 remove zmert and cmert
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3397 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-10 14:28:03 +00:00
bhaddow
904133fcb7 Merge in the multiple models branch. These changes allow the moses server
to support multiple translation, language and generation models within the
same process. The main design change is the introduction of a TranslationSystem
object to manage the models, which have been moved out of StaticData.
The changes should have no effect on existing systems.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3394 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-10 13:12:00 +00:00
bhaddow
d31b030bc5 Write correct ttable type when binarising a phrase table
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3392 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-09 12:28:33 +00:00
bhaddow
f2660e8d41 Fix glue grammar generation for new ttable format
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3386 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-06 14:45:37 +00:00
hieuhoang1972
7e6b3766dd visual studio
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3385 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-03 15:56:41 +00:00
bhaddow
faf65dfcd2 Remove unused options.
Merge in some changes from mert-moses.perl


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3384 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-03 15:54:31 +00:00
hieuhoang1972
579253d3cd add lowercaser
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3380 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-02 14:05:23 +00:00
rafpayen
2ef133e02b add empty fields in glue grammar to accommodate the new phrase table format
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3378 1f5c12ca-751b-0410-a591-d2e778427230
2010-07-30 15:55:14 +00:00
nicolabertoldi
621428de44 improved description of configuration file for [ttable-file] parameter
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3376 1f5c12ca-751b-0410-a591-d2e778427230
2010-07-30 09:10:17 +00:00
hieuhoang1972
8adef921ed new format for consolidate-direct
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3374 1f5c12ca-751b-0410-a591-d2e778427230
2010-07-29 23:20:37 +00:00
hieuhoang1972
0ee6d75566 bug in Good turing
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3372 1f5c12ca-751b-0410-a591-d2e778427230
2010-07-28 22:49:37 +00:00
hieuhoang1972
340ebbd333 bug in Good turing
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3370 1f5c12ca-751b-0410-a591-d2e778427230
2010-07-28 21:52:32 +00:00
hieuhoang1972
ae9779dd7f separate PhraseAlignment class into separate file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3369 1f5c12ca-751b-0410-a591-d2e778427230
2010-07-28 21:28:14 +00:00
hieuhoang1972
3d9d756055 alignment info, new format
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3363 1f5c12ca-751b-0410-a591-d2e778427230
2010-07-27 11:04:03 +00:00
rafpayen
b9e74aab90 change phrase-word-alignment to boolean flag instead of string
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3362 1f5c12ca-751b-0410-a591-d2e778427230
2010-07-22 13:08:46 +00:00
hieuhoang1972
881117d9f5 alignment info in pt
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3361 1f5c12ca-751b-0410-a591-d2e778427230
2010-07-18 19:49:08 +00:00
hieuhoang1972
31930eb6fc alignment info in pt
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3358 1f5c12ca-751b-0410-a591-d2e778427230
2010-07-17 22:29:06 +00:00
pjwilliams
fab2e96d2f In extract-rules, if the source or target syntax contains an unsupported
escape sequence (anything other than "&lt;", "&gt;", "&amp;", "&apos;",
and "&quot;") then write a warning message and skip the sentence pair
(instead of asserting).


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3350 1f5c12ca-751b-0410-a591-d2e778427230
2010-06-29 10:41:42 +00:00
mphi
1f6e9b488b the script now calculates the p-value and confidence intervals not only using BLEU, but also the NIST score;
improved confidence interval representation (avg+-stddev);

fixed bugs



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3345 1f5c12ca-751b-0410-a591-d2e778427230
2010-06-22 20:17:42 +00:00
phkoehn
4e0bc582f6 minor improvements: binarizing rule tables in filter script, multiple reference translation in ems analysis
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3284 1f5c12ca-751b-0410-a591-d2e778427230
2010-05-28 22:19:58 +00:00
pjwilliams
05eb33d5ac In train-model.perl, write an 'unknown-lhs' line out to the config file if
an unknown word label file has been generated.  Also, disable this option
by default since it can greatly increase number of hypotheses generated
and hasn't been shown to help translation yet.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3277 1f5c12ca-751b-0410-a591-d2e778427230
2010-05-27 15:11:16 +00:00
pjwilliams
7d2d79022a Remove temporary file from scripts/released-files.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3267 1f5c12ca-751b-0410-a591-d2e778427230
2010-05-24 15:25:32 +00:00
phkoehn
b271862d7c various updates, mostly related to experiment.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3262 1f5c12ca-751b-0410-a591-d2e778427230
2010-05-18 17:39:16 +00:00
phkoehn
c15fc6f104 minor bug fixes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3247 1f5c12ca-751b-0410-a591-d2e778427230
2010-05-10 22:41:54 +00:00
phkoehn
524b1b12d2 added info for input phrase coverage
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3245 1f5c12ca-751b-0410-a591-d2e778427230
2010-05-10 03:05:35 +00:00
phkoehn
883d12d482 added info for input word coverage to analysis + fixes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3244 1f5c12ca-751b-0410-a591-d2e778427230
2010-05-10 00:19:40 +00:00
bojar
dcb8aafca7 rename train-factored-....perl to train-model.perl also in the Makefile and releasing
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3242 1f5c12ca-751b-0410-a591-d2e778427230
2010-05-07 23:17:55 +00:00
phkoehn
447dccfc59 more analysis in experiment.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3234 1f5c12ca-751b-0410-a591-d2e778427230
2010-05-07 11:28:55 +00:00
phkoehn
45ecfa72d2 minor changes to experiment.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3225 1f5c12ca-751b-0410-a591-d2e778427230
2010-05-05 16:06:34 +00:00
phkoehn
2ed6804f12 official release of experiment.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3224 1f5c12ca-751b-0410-a591-d2e778427230
2010-05-04 23:04:10 +00:00
bojar
4cc43c61f2 without parallelizer, -nbest-list must not be quoted
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3214 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-29 20:25:36 +00:00
bojar
21de1e121f Merge branch 'master' into moses-svn
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3192 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-21 14:48:32 +00:00
bojar
db8e3357f5 Merge branch 'zmert' into moses-svn
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3152 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-20 22:03:04 +00:00
bojar
aa6043a556 mert-moses must quote -n-best-list *if* passed through parallelizer (and must not otherwise)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3151 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-20 17:02:58 +00:00
bojar
8ea058bfc7 fixed handling of the lmodel section; has only 3 ints
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3150 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-20 17:02:41 +00:00
bojar
a7677e7fa2 releasing zmert-moses as well
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3145 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-20 11:11:49 +00:00
bojar
f36f347014 safer everything: tempdir, open, execution
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3144 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-20 11:10:31 +00:00
pjwilliams
2edfc16912 Merge remaining script support for tree-based models from mt3_chart.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3137 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-16 09:45:51 +00:00
hieuhoang1972
a2233d0f8d xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3136 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-14 16:53:39 +00:00
pjwilliams
264c9150d1 Use external `consolidate' program in `train-factored-phrase-model.perl.'
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3135 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-14 15:51:20 +00:00
pjwilliams
5faaedc0df Copy in `consolidate,' `consolidate-direct,' and the new version of
`score' from mt3_chart.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3134 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-14 15:50:17 +00:00
leven101
929bcf25fa added training/lexical-reordering subdir to makefile
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3133 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-13 17:18:46 +00:00
hieuhoang1972
06ee9a3be3 vs
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3132 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-13 16:50:44 +00:00
pjwilliams
53cb08efca Use a generic version of the SAFE_GETLINE macro in scripts/phrase-extract
instead of defining one per source file.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3131 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-13 16:29:55 +00:00
hieuhoang1972
0440dfe079 xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3130 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-13 16:13:56 +00:00
pjwilliams
580acce9e2 Integrate rule extraction code from mt3_chart. There are now two extract
programs: `extract' for the phrase-based model and `extract-rules' for
tree-based models.  They could be combined into a single program, but
they're probably sufficiently different that it isn't worthwhile.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3129 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-13 15:34:39 +00:00
pjwilliams
51ae927ede Start merging in rule extraction code from mt3_chart branch.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3126 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-12 15:22:50 +00:00
pjwilliams
9c2536417f Remove file limit option for phrase extraction.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3122 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-12 11:56:54 +00:00
pjwilliams
99f1c92edb Remove redundant --ZipFiles option from extract.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3120 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-12 10:53:08 +00:00
pjwilliams
4c6c4b71cf Remove redundant --ProperConditioning option from extract.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3118 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-12 10:41:32 +00:00
bojar
7172d05a43 fixed a bug, too eager check for preprocessing type; not needed in some cases
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3111 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 23:21:13 +00:00
bojar
82d6cc714e use qruncmd to parallelize srunblocks
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3110 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 23:20:59 +00:00
bojar
c5f44a2abf better verbosity level for srunblocks: emit some (most importantly fatal) msgs
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3109 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 23:20:45 +00:00
bojar
390ee866d8 require TMT_ROOT only if TectoMT will be actually needed
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3108 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 23:20:30 +00:00
bojar
9fe574f6ed fixed weights for SemPOS_BLEU metric
usage: MERTFLAGS="--semposbleu-weights <sempos_weight>:<bleu_weight>"
e.g. --semposbleu-weights 2:1 to increase the weight of SemPOS


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3107 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:47:22 +00:00
bojar
26d77d15b2 added option --semposbleu-weights to specify weight of SemPOS and BLEU in SemPOS_BLEU metric
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3106 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:47:04 +00:00
bojar
c954a507c6 fixed error message in zmert-moses.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3105 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:46:49 +00:00
bojar
8e2e4eeecc bug fixed
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3104 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:46:34 +00:00
bojar
b8a6048e81 ZMERTSEMPOSSOURCE=factors working for SemPOS_BLEU
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3103 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:46:17 +00:00
bojar
866776810b typo fixed
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3102 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:45:59 +00:00
bojar
b87370aec2 few bugs fixed in ZMERTSEMPOSSOURCE=factors part
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3101 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:45:43 +00:00
bojar
19184783a9 fixed bug in --mert-verbose parameter
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3100 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:45:28 +00:00
bojar
9f3d2f427d fixed nbest-list conversion
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3098 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:44:41 +00:00
bojar
2fb22064a3 bug in passing nbest-lists to zmert
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3097 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:44:21 +00:00
bojar
cd64a02344 fixed bug - missing loop to copy one file to another
Conflicts:

	scripts/training/zmert-moses.pl


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3096 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:43:55 +00:00
bojar
b276bacb72 bug in foreach
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3095 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:43:27 +00:00
bojar
94eb8a5c1b added default parameters for basic metrics
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3094 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:43:10 +00:00
bojar
2604a890d7 zmert training with new metric SemPOS_BLEU - linear combination of SemPOS and BLEU
SemPOS_BLEU requires new TMT block ForSemPOSBLEUMetric.pm in TectoMT. Please,
update your TMT.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3093 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:42:49 +00:00
bojar
5b1541b96d Fixed order of feature scores passed to zmert. Evaluation using BLEU works.
Zmert uses different order of features than Moses. It is necessary to reorder
them when passing to Zmert in nbest-lists.
Previous versions used wrong copy of moses.ini, which linked to phrase tables
that were already filtered for tuning. Thus, phrase tables for evaluation were
missing a lot of phrases and the reported scores were too low.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3092 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:42:20 +00:00
bojar
5a1c308673 Setting number of zmert iterations back to unlimited in zmert config.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3091 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:41:55 +00:00
bojar
671e7e2e4f Default size of nbest-list in zmert-moses.pl set back to 100.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3090 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:41:31 +00:00
bojar
1ce9a15483 Bug fixes in zmert-moses.pl and zmert.jar. Zmert works.
Now it is possible to launch Zmert with SemPOS metric.
It is possible to select a smaller model for McD parser by uncommenting line
with pdt20_train_autTag_golden_latin2_pruned_0.10.model in file zmert.tmt-scen.
Sentences are then analyzed faster (if you use TectoMT to get SemPOS tags).


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3089 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:41:11 +00:00
bojar
1b28cde2e8 small fixes of paths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3088 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:40:29 +00:00
bojar
d47b82369f some small code modifications in zmert-moses.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3087 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:39:47 +00:00
bojar
d34af9b769 updated zmert-moses.pl - zmert with sempos still not running
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3086 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:39:31 +00:00
bojar
49bb6b8882 Updated version of zmert training - still not finished SemPOS factor loading via TMT
zmert-moses.pl - launches zmert training
zmert-decoder.pl - is launched by zmert after each training iteration to compute scores
                   with updated lambdas
zmert.tmt-scen - TMT scenario to extract SemPOS factor for sentence without any factors


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3085 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:39:14 +00:00
bojar
90fecff3aa New mert training (Zmert) for Moses
Zmert jar includes SemPOS metric extension.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3084 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:38:53 +00:00
pjwilliams
b3f6e211fd Fix mistakes in previous commit (oh, and revert to my own svn username
to prevent my shoddy check-ins from further sullying Hieu's good name).


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3083 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 14:02:58 +00:00
hieuhoang1972
c6d20e1f9f Update the training scripts to support the new format parameter for
'ttable-file'


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3082 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 11:37:43 +00:00
sarst
7d3a22c8a7 Bugfixes for the new lexical reordering. Running without a reordering model, and with the reordering flag specifying distance now works. Formatting of the extract.o file is now correct with 'f'-reordering models.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3054 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-06 12:08:04 +00:00
bojar
f703de9377 adding a script by Pranava Swaroop for Bing translation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3018 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-02 13:46:46 +00:00
bojar
1883ea180c unescaping chars that google escapes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3017 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-02 13:46:31 +00:00
bhaddow
bd9f392875 Fix for training with non-lexicalised reordering
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3016 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-01 14:05:58 +00:00
sarst
943275e331 Set the default debug mode to 0 in train-factored-phrase-model.perl (which was the case before merging the hierarchical reordering branch to trunk)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3009 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-30 12:09:19 +00:00
bhaddow
9573147e36 Fix --reordering-table option
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3004 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-26 09:32:15 +00:00
bhaddow
8060024cc1 Can now specify reordering table when executing step 9
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2997 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-23 18:55:48 +00:00
bhaddow
a9920a68e1 remove extra . from ini
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2991 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-20 21:18:55 +00:00
bhaddow
4b6d3dddd2 fix error in merge
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2990 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-19 18:35:48 +00:00
bhaddow
795224736b Merge revisions 2670-2988 from trunk. Passes all regression except lexicalised
reordering


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2989 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-19 17:52:51 +00:00
bgottesman
5a3a6bd3b0 set utf8 mode on the input and output files, instead of on stdin and stdout, which are not used. This allows case variants of non-ASCII characters to be recognized correctly
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2987 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-18 19:13:05 +00:00
bgottesman
872d50dd9b Fix a line that extracts the source FACTORLIST from a FACTORMAP (in the terms of the Moses manual) so that it succeeds even if the target FACTORLIST contains commas, i.e. contains multiple factors
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2965 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-10 18:03:53 +00:00
bojar
d7335c0de1 avoid reducing corpus file if it already contains the needed factors
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2945 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-24 16:14:07 +00:00
sarst
435539d4ed Added an option to choose exactly which steps to run, not only the first and the last (since we sometimes need to do step 5 and 7 to retrain lexical reordering)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2917 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-19 18:03:44 +00:00
sarst
b95cc2f556 Added the check from word-based models of the alignment points in the adjacent corners, to the more complex models.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2916 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-19 15:15:24 +00:00
sarst
bdfe47f99b added option to skip phrase-scoring, when we want to train new reordering models on old trainings
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2914 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-18 21:16:00 +00:00
sarst
03e24c9856 changed the maximum model config, to always print all models
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2912 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-18 13:45:15 +00:00
sarst
a86470c974 lexical reordering models: it is now possible to add the maximal reordering orientations to the extract file, and the collapsing information is no longer part of the filename. Also removed some old variables, that are not used anymore.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2911 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-18 13:07:32 +00:00
hieuhoang1972
ad3b032ebf roll back
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2910 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-18 11:00:40 +00:00
chardmeier
b10d8eb469 Added lex reo scorer to released-files and removed extra ||| from reordering table.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2894 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-12 16:23:32 +00:00
sarst
c65945b531 Cleaned up lexical reordering scoring, and passed vectors by reference instead of copying them. Fixed bugs in extract: it used to choose the wrong orientation at the end of sentences, and the hierarchical model type is no longer dependent on the phrase-based model type.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2892 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-12 13:46:33 +00:00
sarst
e812b2a8e3 Bugfixes: printing the correct phrases in the table, and fixed misspelling of monotonicity.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2885 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-11 09:09:44 +00:00
sarst
92368ba490 Rewrote the lexical reordering model scoring in C++. Adapted train-factored-phrase-model.perl to that change. Minor fixes in other places, for compatibility
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2884 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-10 17:19:06 +00:00
mphi
9e8352a041 modified the implementation, removing unnecessary repetition, thus making the whole process approximately fifty times faster
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2866 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-07 09:11:09 +00:00
mphi
05e21dc5e2 fixed compatibility of the --final-alignment-model 2 switch and mgiza (ibm2 is done in a single thread, hence no *.part* files)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2859 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-04 16:12:57 +00:00
bojar
ff05e5a1b5 list frequent mismatched tokenizations first
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2852 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 16:37:08 +00:00
bojar
9b10946f10 fixed regexes to read current -osg format
verbose at bad lines


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2850 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:35:21 +00:00
bojar
8891069ed8 safer extraction of job id
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2849 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:24:08 +00:00
bojar
a9c2069720 switching to bash, avoiding csh's >& redirection (not accepted by dash)
making --robust take the maximum number of restarts


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2848 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:23:53 +00:00
bojar
011840bf65 adding --force-factored-filenames to avoid problems with eager removal of '0-0'
Conflicts:

	scripts/training/train-factored-phrase-model.perl


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2847 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:23:35 +00:00
bojar
dbfe610546 uppercasing first letter even if after punct
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2846 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:23:20 +00:00
bojar
594e5e8acd adding a handy script for suspicious tokenization
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2845 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:23:06 +00:00
pasmargo
275c06d9e7 Kneser-Ney and modified Kneser-Ney smoothing implementation.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2837 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-02 18:14:01 +00:00
pasmargo
63bdf3b602 reverted to previous version due to a couple of mistakes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2836 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-02 16:51:14 +00:00
pasmargo
71b15ae0d4 Kneser-Ney and modified Kneser-Ney smoothing implementation.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2835 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-02 16:28:56 +00:00
naditomeh
242d6c6ddd word-based, phrase-based and hierarchical reordering is implemented in the training
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2823 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-31 23:56:45 +00:00
sanmarf
c37f2dc14e 4th MT marathon - lexical decomposition
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2805 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-30 10:11:35 +00:00
sarst
0785c09d03 bugfix for factored-training config
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2804 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-30 09:34:55 +00:00
sarst
f2a5678541 added new file hierarchical.h
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2803 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-30 09:16:27 +00:00
sarst
6a13b8d186 bugfixes in train-factored-phrase-model*.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2801 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-30 00:09:45 +00:00
sarst
bf70dd4767 submitted working scripts for hierarchical training (msd)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2796 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 22:38:18 +00:00
bojar
55e3ee4a30 just setting the executable bit
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2795 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 19:49:37 +00:00
bojar
2097e45edd a handy script for calculating out-of-vocabulary rate of n-grams
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2794 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 19:48:29 +00:00
naditomeh
ad3b0760b2 adding extract.cpp
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2770 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 13:59:04 +00:00
naditomeh
03de8a99d8 adding extract.cpp
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2769 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 13:41:34 +00:00
sarst
4eec020d5b bugfixes to train-factored-phrase-model.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2764 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 12:11:10 +00:00
sarst
a9ef19edf0 updated train-factored-phrase-model.perl to work with the new hierarchical reordering framework
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2759 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 11:29:39 +00:00
bojar
9f784c6bf8 a handy script to get many translations from Google (can continue interrupted
sessions)


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2744 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 01:48:13 +00:00
bojar
ed18df8dc7 allow env.var to override BINDIR and TARGETDIR
exclude memscore from the released things if failed to compile


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2740 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 00:51:28 +00:00
phkoehn
4d814e53d2 del
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2728 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-28 17:38:26 +00:00
phkoehn
f1f395e05d added web interface, organized files in sub directories
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2724 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-28 17:24:39 +00:00
hieuhoang1972
53e54def0b indent
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2723 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-28 17:17:19 +00:00
hieuhoang1972
f5ebdbcec8 xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2721 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-28 16:43:36 +00:00
phkoehn
244e334b3d bug fix
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2690 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-27 14:55:17 +00:00
bojar
0889b9efff renaming .pl -> .perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2674 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-26 17:23:41 +00:00
bojar
0e26f91865 don't organize to stacks by default, accept --organize-to-stacks
read from stdin as well


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2673 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-26 17:20:28 +00:00
bojar
536c7bdbcc committing a script by Loic Barrault to display moses search graph
(-output-search-graph) using graphviz dot


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2672 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-26 17:01:12 +00:00
phkoehn
def35604af initial release of experiment.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2669 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-25 17:38:53 +00:00
chardmeier
6317633148 Added memscore phrase scorer.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2653 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-18 14:52:34 +00:00
nicolabertoldi
3ad833d136 program to compute countings for phrase pairs
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2647 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-08 17:16:37 +00:00
nicolabertoldi
34d9feccc8 now it is possible to perform mert on a subset of features
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2646 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-08 15:56:45 +00:00
mphi
850e54f17d added a switch to the training script, which allows using different word alignment models: --final-alignment-model X, where X is either 1/2/3/4/5 (for the ibm models 1 to 5) or hmm; the latter is equivalent to using --hmm-align.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2644 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-08 12:00:28 +00:00
bhaddow
d8864fa6ad support for mgiza
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2640 1f5c12ca-751b-0410-a591-d2e778427230
2009-12-15 17:36:06 +00:00
eisele
12dda84589 test, please ignore
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2598 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-23 15:45:35 +00:00
nicolabertoldi
5ad52827e3 minor change
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2576 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-09 07:47:14 +00:00
nicolabertoldi
427d421cf9 small change to be compliant with the previous change (2571->2572)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2573 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-07 18:09:20 +00:00
nicolabertoldi
7b77734e3c the ordered list of feature names is now stored in a file after each step and reloaded when training is restarted
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2572 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-07 18:03:41 +00:00
nicolabertoldi
6d45d03f48 removed local references
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2571 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-07 17:34:34 +00:00
nicolabertoldi
3731f83b8d minor changes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2570 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-07 16:43:07 +00:00
nicolabertoldi
98387244c1 added a new regression test for --continue option of mert-moses-new.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2569 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-07 16:37:25 +00:00
nicolabertoldi
124f88e55a enabled the --continue option to re-start an interrupted mert from the last finished step
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2568 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-07 16:35:57 +00:00
nicolabertoldi
e25b8c41b7 minor change
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2567 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-07 16:33:32 +00:00
nicolabertoldi
40f9b00bab use of compressed data
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2566 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-07 12:47:40 +00:00
nicolabertoldi
2d1e4697f2 adding a new regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2564 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-05 18:01:57 +00:00
nicolabertoldi
a93041d3d8 adding a new regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2563 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-05 18:01:15 +00:00
nicolabertoldi
572f54f474 removing useless files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2562 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-05 17:59:41 +00:00
nicolabertoldi
fa6a5bfc35 with this change, the usage of initial points for mert works properly
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2558 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-01 16:07:13 +00:00
nicolabertoldi
d4083b1119 adding very basic regression regr-tests for mert-moses-new which use a virtual decoder simulating the generation of the nbest lists
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2557 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-01 15:42:55 +00:00
nicolabertoldi
3484a1fd93 fixed minor bugs
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2550 1f5c12ca-751b-0410-a591-d2e778427230
2009-10-01 07:44:30 +00:00
hieuhoang1972
9b18ec4a29 add release files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2527 1f5c12ca-751b-0410-a591-d2e778427230
2009-08-25 10:13:47 +00:00
nicolabertoldi
f75e3993ac correction of parameter description
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2462 1f5c12ca-751b-0410-a591-d2e778427230
2009-08-05 16:54:33 +00:00
nicolabertoldi
8384857f8c changes to mert-moses-new.pl to work with different reference length policies: shortest, average, closest (either "--shortest", "--average", or "--closest") for BLEU and with case-sensitive/insensitive evaluation (--nocase)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2461 1f5c12ca-751b-0410-a591-d2e778427230
2009-08-05 16:39:06 +00:00
hieuhoang1972
568e973b2e make gcc and make calls consistent for eric to use in ubuntu package
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2347 1f5c12ca-751b-0410-a591-d2e778427230
2009-05-27 12:54:04 +00:00
phkoehn
8833098925 generalized n-best list reporting for feature functions, added experimental version of global lexical model
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2343 1f5c12ca-751b-0410-a591-d2e778427230
2009-05-26 19:30:35 +00:00
mphi
17c3cfffac added unpaired significance evaluation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2328 1f5c12ca-751b-0410-a591-d2e778427230
2009-05-12 18:56:01 +00:00
bhaddow
981e440cc2 Fix detection of binarised reordering table
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2240 1f5c12ca-751b-0410-a591-d2e778427230
2009-03-13 12:28:34 +00:00
bhaddow
e1d7bb986c Add option for predictable seeding
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2220 1f5c12ca-751b-0410-a591-d2e778427230
2009-02-25 19:31:17 +00:00
phkoehn
8d5aef137b bug fix
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2113 1f5c12ca-751b-0410-a591-d2e778427230
2009-02-09 16:00:35 +00:00
phkoehn
a62f8ee316 added truecaser
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2112 1f5c12ca-751b-0410-a591-d2e778427230
2009-02-09 15:32:34 +00:00
jdschroeder
cc95706045 mert-moses.pl now supports multiple input weights for lattices and confusion networks, using the --inputweights argument.
I'll leave it to someone who knows mert-moses-new.pl better to make the changes there.

"zcat" is now abstracted as a $ZCAT variable in these files, and is set to "gzip -cd" which should work on more platforms (notably on the mac, where zcat fails unless an archive name ends in ".Z").

 


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2082 1f5c12ca-751b-0410-a591-d2e778427230
2009-02-05 17:39:36 +00:00
phkoehn
98381c0193 fixed xml removal
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1995 1f5c12ca-751b-0410-a591-d2e778427230
2009-01-24 05:21:36 +00:00
phkoehn
616842f278 fixed multi-bleu documentation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1971 1f5c12ca-751b-0410-a591-d2e778427230
2009-01-08 00:47:10 +00:00
nicolabertoldi
2075f9dda1 modification to mert script to allow the use of fewer nbest lists; features and scores are no longer gzipped
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1965 1f5c12ca-751b-0410-a591-d2e778427230
2008-12-30 17:33:16 +00:00
bojar
091c9ece28 raising line_max_length
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1953 1f5c12ca-751b-0410-a591-d2e778427230
2008-12-05 10:01:05 +00:00
bojar
586d7e2f84 minor fix when handling gzipped corpora
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1952 1f5c12ca-751b-0410-a591-d2e778427230
2008-12-04 17:49:59 +00:00
bojar
2c900c8bd7 uncompress input files for phrase extract, if needed
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1951 1f5c12ca-751b-0410-a591-d2e778427230
2008-11-27 10:39:28 +00:00
bhaddow
1e13f6d2d6 Weights can sometimes be in exponential format.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1947 1f5c12ca-751b-0410-a591-d2e778427230
2008-11-25 09:54:45 +00:00
hieuhoang1972
2807bc48ad absolute file name check, provided by Eric Kow
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1944 1f5c12ca-751b-0410-a591-d2e778427230
2008-11-20 13:03:59 +00:00
hieuhoang1972
254284e57e patch to fix fiddly env variable and directory stuff, provided by Eric Kow
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1943 1f5c12ca-751b-0410-a591-d2e778427230
2008-11-20 13:01:49 +00:00
phkoehn
abb2fc37b1 proper binarization of lexicalized reordering model
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1938 1f5c12ca-751b-0410-a591-d2e778427230
2008-11-10 16:03:16 +00:00
phkoehn
bfbbefd710 bug fixes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1917 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-26 03:32:29 +00:00
mphi
8a4c6a2c63 put significance test into proper location
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1915 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-23 09:16:33 +00:00
mphi
88d3b775ce altered the bootstrap significance script algorithm according to (Riezler and Maxwell 2005 @ MTSE'05)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1914 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-23 09:03:41 +00:00
phkoehn
a09242ad16 bug fix with phrase table name in moses.ini, when using hmm alignment
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1913 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-20 21:37:57 +00:00
mphi
f033e32979 Added implementation of Koehn's 2004 EMNLP paired bootstrap resampling
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1911 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-20 11:55:12 +00:00
phkoehn
1c7b305152 bug fix
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1910 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-19 07:32:48 +00:00
phkoehn
1b5d99ad26 added headers for standard compliance (gcc 4.3 on 64 bit linux)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1905 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-16 21:14:38 +00:00
phkoehn
3a5981ce9d major improvements, see email to moses-support
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1904 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-15 23:25:14 +00:00
phkoehn
614876771d extended extract/score, to allow for one big file, not just parts
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1903 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-15 22:12:56 +00:00
jdschroeder
78534c1518 made all zcat calls through ZCAT variable.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1875 1f5c12ca-751b-0410-a591-d2e778427230
2008-08-15 16:15:51 +00:00
jdschroeder
7a2ebedc20 minor bugfixes and error checking
-added -rootdir option to enhanced-mert
-fixed float regex in score-nbest.py and mert-moses.pl
-allow for extra weights in constructing ini in mert-moses.pl
-additional NFS bug checks in mert-moses.pl



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1869 1f5c12ca-751b-0410-a591-d2e778427230
2008-08-01 10:24:01 +00:00
bojar
6a087d59c4 removed SCRIPTS_ROOTDIR from this 'my' declaration, it was obscuring previous
declaration!
lines wrapped


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1865 1f5c12ca-751b-0410-a591-d2e778427230
2008-07-14 16:24:18 +00:00
bojar
2afe9e0357 avoid coredump files in parallel moses (usually just kills NFS for a while),
debug on a smaller scale, if needed


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1864 1f5c12ca-751b-0410-a591-d2e778427230
2008-07-14 13:58:42 +00:00
bojar
c20e682f18 Avoid NFS race condition:
explicitly remove old cmert output files (relying on them being correctly
  replaced by a 'mv' in the shell script submitted to SGE by qsubwrapper
  occasionally reveals a race condition in NFS => weights seem unchanged =>
  mert finishes too early)


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1862 1f5c12ca-751b-0410-a591-d2e778427230
2008-07-10 11:47:55 +00:00
bhaddow
83f234cf17 Implementation of Cer et al mert regularisation. Use with argument such
as --scconfig regtype:min,regwin:3 in extractor and mert. Only tested
on toy example so far.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1860 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-24 19:27:18 +00:00
hieuhoang1972
52c2843e6c perl regexpr bug, submitted by German Sanchis Trilles
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1855 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-19 21:57:29 +00:00
bhaddow
4195b70247 First cut of new mert outer loop
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1842 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-10 09:07:20 +00:00
hieuhoang1972
1b44c7c445 most popular alignment outputted, finally
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1818 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-04 14:49:56 +00:00
hieuhoang1972
8554a7c89d most popular alignment outputted, finally
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1817 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-04 14:42:51 +00:00
hieuhoang1972
3832f68fed most popular alignment outputted, finally
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1816 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-04 14:40:04 +00:00
hieuhoang1972
bf34eb891d don't output alignment if inverse
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1813 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-03 12:25:37 +00:00
hieuhoang1972
b48ce341e9 output most aligned instead of merged
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1798 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-28 13:49:04 +00:00
phkoehn
7498f469ab get scripts rootdir by FindBin
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1745 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-16 15:54:02 +00:00
hieuhoang1972
a2a3d33103 explicitly use bash
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1693 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-15 08:50:22 +00:00
hieuhoang1972
3fc0b8ddb4
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1603 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-04 12:52:52 +00:00
nicolabertoldi
def5f419c2 - handling of word graph generation in the parallel environment (-output-word-graph)
- handling of '-' (i.e. /dev/stdout) for word graphs
- if either translations, nbests, searchgraphs or wordgraphs are output to stdout
  they are concatenated in this order
  BUT I STRONGLY RECOMMEND NOT TO DO THAT


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1598 1f5c12ca-751b-0410-a591-d2e778427230
2008-04-23 10:03:45 +00:00
nicolabertoldi
d514c277df - handling of search graph generation in the parallel environment (-output-search-graph)
- modification of the parameter for nbest generation in the parallel environment:
  I make it similar to moses parameter (-n-best-list)
- handling of '-' (i.e. /dev/stdout) for nbest and search graphs


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1597 1f5c12ca-751b-0410-a591-d2e778427230
2008-04-23 08:37:44 +00:00
hieuhoang1972
a822d61d8f prevent -inf in lex re-ordering. Code contributed by Christian Hardmeier
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1596 1f5c12ca-751b-0410-a591-d2e778427230
2008-04-18 09:04:38 +00:00
nicolabertoldi
1aff3d2382 correct handling of binary phrase tables
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1579 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-28 14:33:15 +00:00
nicolabertoldi
def0fff5cd changes to handle lattice input format
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1578 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-28 08:55:40 +00:00
hieuhoang1972
0bb92c2e79 merge properly
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1577 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-27 19:01:38 +00:00
hieuhoang1972
cb1f0e56dc optional output what lines are retained
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1576 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-27 18:38:31 +00:00
bojar
f056bdbfde fixed to correctly handle models in [distortion-file] section
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1572 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-26 10:24:53 +00:00
bojar
3957dc6b4c default to reordering factors of 0-0 even if decoding steps are set (users
might have explicitly said e.g. t0-0!)


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1571 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-26 10:20:09 +00:00
hieuhoang1972
cced54cf7d win32 fix provided by jc read
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1569 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-22 21:47:32 +00:00
bojar
f7a1fb5b9c corpus compression correctly used even for generation step
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1568 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-22 16:14:30 +00:00
bojar
7f3e34207a added some heuristics for Czech quotation marks
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1567 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-22 15:07:46 +00:00
bojar
6af3140978 added optional sentence uppercasing (use -u)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1566 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-22 14:50:43 +00:00
bojar
8b3d44b2e2 SAFE_GETLINE made safer: will exit if the line does not fit into the buffer
instead of just going on and getting the src/tgt/alignment files out of sync


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1565 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-22 14:42:01 +00:00
bojar
f89ab590ec added Nicola's enhancedmert to released files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1564 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-22 13:30:05 +00:00
redpony
25750c6555 if giza returns sentences that have different lengths in different directions (due to truncation or other errors), don't silently fail. print a blank line instead.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1562 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-19 20:48:14 +00:00
bojar
fa31d83421 even factors that are being added can be gzipped
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1561 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-19 17:32:51 +00:00
bojar
eec1bdb623 added support to open gzipped files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1560 1f5c12ca-751b-0410-a591-d2e778427230
2008-02-19 16:05:11 +00:00
nicolabertoldi
ae319da62b revert to /bin/sh for enhanced-mert; use of setenv (instead of export) in the csh scripts created by qsub-wrapper.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1515 1f5c12ca-751b-0410-a591-d2e778427230
2007-11-21 14:31:21 +00:00
nicolabertoldi
0176c5f8ec use of csh instead of sh
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1514 1f5c12ca-751b-0410-a591-d2e778427230
2007-11-21 07:59:17 +00:00
bojar
89ea9828ba added ttable iterator to this script, too
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1498 1f5c12ca-751b-0410-a591-d2e778427230
2007-11-06 03:33:41 +00:00
bojar
09d8b5e657 improving documentation and allowing environment variables to override the
default paths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1497 1f5c12ca-751b-0410-a591-d2e778427230
2007-11-06 03:16:17 +00:00
nicolabertoldi
568f92b310 bug fixed
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1491 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-30 15:42:26 +00:00
jdschroeder
e52040bc12 added str length check to stop std::out_of_range error in a few more spots - similar bug to one corrected in v. 1319
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1488 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-26 12:42:55 +00:00
nicolabertoldi
fd3ecd4334 bug fixed
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1487 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-25 14:34:49 +00:00
nicolabertoldi
1b0576ba6c small bug fixed: temporary concatenated sorted file is now deleted only at the end
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1486 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-24 07:50:51 +00:00
nicolabertoldi
8710cc9bc9 features can be activated using a comma- or blank-separated list
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1485 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-23 16:55:02 +00:00
nicolabertoldi
9e70b5ffd8 Features are activated using their names
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1484 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-23 16:30:02 +00:00
nicolabertoldi
8fe62f2b95 some small bugs fixed and clean up
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1483 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-23 14:15:19 +00:00
nicolabertoldi
4720d1cb9f bug fixed
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1482 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-22 17:01:53 +00:00
nicolabertoldi
918dae011a bug fixed
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1481 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-22 14:00:10 +00:00
nicolabertoldi
e7ac20d4d6 bug fixed in the name of a temporary file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1480 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-22 13:34:12 +00:00
nicolabertoldi
db9d0fc539 Added a more time-efficient (but more memory-consumptive) method to rescore nbest list
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1479 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-22 10:08:07 +00:00
nicolabertoldi
b827d51870 changes to cope with the new mert suite (enhanced-mert)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1478 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-22 09:45:19 +00:00
jdschroeder
a969197e16 Fixed passing decoder parameters when tuning on single machine.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1477 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-22 09:27:09 +00:00
nicolabertoldi
5759005857 Suite of scripts to perform MERT on a subset of features.
Look at the directory example to learn about its use.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1476 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-22 09:04:03 +00:00
redpony
0cf583e249 add --hmm-align option. Allows using Giza++'s HMM word alignment model as the underlying word alignment. It is much faster than Model 4 alignment and not much worse.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1474 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-19 03:44:05 +00:00
nicolabertoldi
901823d83a explicit export of PYTHONPATH variable
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1473 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-18 15:33:04 +00:00
nicolabertoldi
81b439d728 minor changes in passing parameters to moses-parallel
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1472 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-18 12:36:49 +00:00
hieuhoang1972
4e1cad4bbe fixed sync/async bug
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1471 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-04 17:31:31 +00:00
redpony
57dcaa8e80 performance fixes for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1470 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-02 21:43:54 +00:00
hieuhoang1972
9cbc2922b4 separate word penalty for each decode step for async
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1469 1f5c12ca-751b-0410-a591-d2e778427230
2007-10-02 12:52:00 +00:00
hieuhoang1972
d2d03c33e7 fixed bug which prevented mert working when phrase table NOT filtered or binarised
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1449 1f5c12ca-751b-0410-a591-d2e778427230
2007-08-10 15:48:58 +00:00
hieuhoang1972
53fa2cb18a async - don't use binarising or filtering
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1443 1f5c12ca-751b-0410-a591-d2e778427230
2007-08-05 20:29:46 +00:00
hieuhoang1972
9eba034662 turn off debugging
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1435 1f5c12ca-751b-0410-a591-d2e778427230
2007-07-25 10:05:34 +00:00
hieuhoang1972
2beb0c44e9 mkdir before doing generation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1425 1f5c12ca-751b-0410-a591-d2e778427230
2007-07-14 09:54:05 +00:00
nicolabertoldi
75afdf04a5 I corrected direction of alignment
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1421 1f5c12ca-751b-0410-a591-d2e778427230
2007-06-27 17:57:55 +00:00
nicolabertoldi
ac91cb78cc two additional (and simpler) ways of extracting alignments: source-to-target and target-to-source
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1420 1f5c12ca-751b-0410-a591-d2e778427230
2007-06-27 16:34:14 +00:00
nicolabertoldi
7f9c2856c2 changes to reduce disk memory consumption during training
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1419 1f5c12ca-751b-0410-a591-d2e778427230
2007-06-27 16:30:20 +00:00
phkoehn
960bebdd4a fixed clean script to handle '|'s
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1416 1f5c12ca-751b-0410-a591-d2e778427230
2007-06-18 15:50:04 +00:00
redpony
c747cdd505 fix dumb error
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1414 1f5c12ca-751b-0410-a591-d2e778427230
2007-06-01 16:27:56 +00:00
redpony
1f050e198a fix compile error, enable optimizations
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1413 1f5c12ca-751b-0410-a591-d2e778427230
2007-06-01 16:26:26 +00:00
redpony
564bb5a64e make scorer use compiler optimization
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1412 1f5c12ca-751b-0410-a591-d2e778427230
2007-06-01 16:22:30 +00:00