bojar
c954a507c6
fixed error message in zmert-moses.pl
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3105 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:46:49 +00:00
bojar
8e2e4eeecc
bug fixed
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3104 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:46:34 +00:00
bojar
b8a6048e81
ZMERTSEMPOSSOURCE=factors working for SemPOS_BLEU
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3103 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:46:17 +00:00
bojar
866776810b
typo fixed
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3102 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:45:59 +00:00
bojar
b87370aec2
few bugs fixed in ZMERTSEMPOSSOURCE=factors part
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3101 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:45:43 +00:00
bojar
19184783a9
fixed bug in --mert-verbose parameter
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3100 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:45:28 +00:00
bojar
9f3d2f427d
fixed nbest-list conversion
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3098 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:44:41 +00:00
bojar
2fb22064a3
bug in passing nbest-lists to zmert
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3097 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:44:21 +00:00
bojar
cd64a02344
fixed bug - missing loop to copy one file to another
...
Conflicts:
scripts/training/zmert-moses.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3096 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:43:55 +00:00
bojar
b276bacb72
bug in foreach
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3095 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:43:27 +00:00
bojar
94eb8a5c1b
added default parameters for basic metrics
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3094 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:43:10 +00:00
bojar
2604a890d7
zmert training with new metric SemPOS_BLEU - linear combination of SemPOS and BLEU
...
SemPOS_BLEU requires new TMT block ForSemPOSBLEUMetric.pm in TectoMT. Please,
update your TMT.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3093 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:42:49 +00:00
bojar
5b1541b96d
Fixed order of feature scores passed to zmert. Evaluation using BLEU works.
...
Zmert uses different order of features than Moses. It is necessary to reorder
them when passing to Zmert in nbest-lists.
Previous versions used wrong copy of moses.ini, which linked to phrase tables
that were already filtered for tuning. Thus, phrase tables for evaluation were
missing a lot of phrases and the reported scores were too low.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3092 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:42:20 +00:00
bojar
5a1c308673
Setting number of zmert iterations back to unlimited in zmert config.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3091 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:41:55 +00:00
bojar
671e7e2e4f
Default size of nbest-list in zmert-moses.pl set back to 100.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3090 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:41:31 +00:00
bojar
1ce9a15483
Bug fixes im zmert-moses.pl and zmert.jar. Zmert works.
...
Now it is possible to lauch Zmert with SemPOS metric.
It is possible to select a smaller model for McD parser by uncommenting line
with pdt20_train_autTag_golden_latin2_pruned_0.10.model in file zmert.tmt-scen.
Sentences are then analyzed faster (if you use TectoMT to get SemPOS tags).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3089 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:41:11 +00:00
bojar
1b28cde2e8
small fixes of paths
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3088 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:40:29 +00:00
bojar
d47b82369f
some small code modifications in zmert-moses.pl
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3087 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:39:47 +00:00
bojar
d34af9b769
updated zmert-moses.pl - zmert with sempos still not running
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3086 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:39:31 +00:00
bojar
49bb6b8882
Updated version of zmert training - still not finished SemPOS factor loading via TMT
...
zmert-moses.pl - launches zmert training
zmert-decoder.pl - an is launched by zmert after each training iteration to compute scores
with updated lambdas
zmert.tmt-scen - TMT scenario to extract SemPOS factor for sentence without any factors
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3085 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:39:14 +00:00
bojar
90fecff3aa
New mert training (Zmert) for Moses\n\nZmert jar includes SemPOS metric extension.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3084 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 22:38:53 +00:00
pjwilliams
b3f6e211fd
Fix mistakes in previous commit (oh, and revert to my own svn username
...
to prevent my shoddy check-ins from further sullying Hieu's good name).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3083 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 14:02:58 +00:00
hieuhoang1972
c6d20e1f9f
Update the training scripts to support the new format parameter for
...
'ttable-file'
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3082 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-09 11:37:43 +00:00
sarst
7d3a22c8a7
Bugfixes for the new lexical reordering. Running without a reordering model, and with the reordering flag specyfying distance now works. Formatting of the extract.o file is now correct with 'f'-reordering models.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3054 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-06 12:08:04 +00:00
bojar
f703de9377
adding a script by Pranava Swaroop for Bing translation
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3018 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-02 13:46:46 +00:00
bojar
1883ea180c
unescaping chars that google escapes
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3017 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-02 13:46:31 +00:00
bhaddow
bd9f392875
Fix for training with non-lexicalised reordering
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3016 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-01 14:05:58 +00:00
sarst
943275e331
Set the default debug mode to 0 in train-factored-phrase-model.perl (which was the case before merging the hierarchical reordering branch to trunk)
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3009 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-30 12:09:19 +00:00
bhaddow
9573147e36
Fix --reordering-table option
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3004 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-26 09:32:15 +00:00
bhaddow
8060024cc1
Can now specify reordering table when executing step 9
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2997 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-23 18:55:48 +00:00
bhaddow
a9920a68e1
remove extra . from ini
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2991 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-20 21:18:55 +00:00
bhaddow
4b6d3dddd2
fix error in merge
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2990 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-19 18:35:48 +00:00
bhaddow
795224736b
Merge revisions 2670-2988 from track. Passes all regression except lexicalised
...
reordering
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2989 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-19 17:52:51 +00:00
bgottesman
5a3a6bd3b0
set utf8 mode on the input and output files, instead of on stdin and stdout, which are not used. This allows case variants of non-ASCII characters to be recognized correctly
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2987 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-18 19:13:05 +00:00
bgottesman
872d50dd9b
Fix a line that extracts the source FACTORLIST from a FACTORMAP (in the terms of the Moses manual) so that it succeeds even if the target FACTORLIST contains commas, i.e. contains multiple factors
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2965 1f5c12ca-751b-0410-a591-d2e778427230
2010-03-10 18:03:53 +00:00
bojar
d7335c0de1
avoid reducing corpus file if it already contains the needed factors
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2945 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-24 16:14:07 +00:00
sarst
435539d4ed
Added an option to choose exactly which steps to run, not only the first and the last (since we sometimes need to do step 5 and 7 to retrain lexical reordering)
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2917 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-19 18:03:44 +00:00
sarst
b95cc2f556
Added the check from word-based models of the alignment points in the adjacent corners, to the more complex models.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2916 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-19 15:15:24 +00:00
sarst
bdfe47f99b
added option to skip phrase-scoring, when we want to train new reordering models on old trainings
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2914 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-18 21:16:00 +00:00
sarst
03e24c9856
changed the maximum model config, to always print all models
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2912 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-18 13:45:15 +00:00
sarst
a86470c974
lexical reordering models: it is now possible to add the maximal reordering orientations to the extract file, and the collapsing information is no longer part of the filename. Also removed some old variables, that are not used anymore.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2911 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-18 13:07:32 +00:00
hieuhoang1972
ad3b032ebf
roll back
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2910 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-18 11:00:40 +00:00
chardmeier
b10d8eb469
Added lex reo scorer to released-files and removed extra ||| from reordering table.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2894 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-12 16:23:32 +00:00
sarst
c65945b531
Cleaned up lescial reordering scoring, and sent vectors as references instead of copying them. Fixed bugs in extract: it used to choose the wrong orientation at end of sentences, and the hierarchical model typ is no longer dependent on the phrase-based model type.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2892 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-12 13:46:33 +00:00
sarst
e812b2a8e3
Bugfixes: printing the correct phrases in the table, and fixed misspelling of monotonicity.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2885 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-11 09:09:44 +00:00
sarst
92368ba490
Rewrote the lexical reordering model scoring in C++. Adapted train-factored-phrase-model.perl to that change. Minor fixes in other places, for compatibility
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2884 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-10 17:19:06 +00:00
mphi
9e8352a041
modified the implementation, removing unnecessary repetition, thus making the whole process approximately fifty times faster
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2866 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-07 09:11:09 +00:00
mphi
05e21dc5e2
fixed compatibility of the --final-alignment-model 2 switch and mgiza (ibm2 is done in a single thread, hence no *.part* files)
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2859 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-04 16:12:57 +00:00
bojar
ff05e5a1b5
list frequent mismatched tokenizations first
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2852 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 16:37:08 +00:00
bojar
9b10946f10
fixed regexes to read current -osg format
...
verbose at bad lines
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2850 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:35:21 +00:00