Commit Graph

514 Commits

Author SHA1 Message Date
bhaddow
e664b4a4b3 Merge 3791-3842 from trunk
git-svn-id: http://svn.statmt.org/repository/mira@3873 cc96ff50-19ce-11e0-b349-13d7f0bd23df
2011-08-18 12:59:36 +02:00
nicolabertoldi
dbad1bb7aa now mert-moses.pl correctly call Moses for generating nbest
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3782 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-15 14:49:34 +00:00
nicolabertoldi
ab2185c4a5 more robust behavior of qsub-wrapper.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3781 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-15 14:47:51 +00:00
pjwilliams
3dec57a518 When scoring phrase pairs, store copies of the active pairs' PHRASE objects
instead of inserting them into a PhraseTable.  In a test on a 21GB
target-syntax extract file, this reduced user time from 195 to 120 mins.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3777 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-14 23:49:57 +00:00
pjwilliams
627d8edf8e Fix bug affecting Good-Turing discounting: repeated phrase pairs were always
contributing a count of 1 because PhraseAlignment::addToCount() was looking
for counts in the fifth column, not the fourth.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3775 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-14 16:31:53 +00:00
bhaddow
4174082396 Non-breaking prefixes for Dutch
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3764 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-08 16:09:24 +00:00
dowobeha
44b3af7cac Re-enabled --skip-decoder in mert-moses.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3759 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-03 16:44:13 +00:00
rafpayen
be92193c03 fix for multiple whitespace in dictionary
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3750 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-30 11:16:07 +00:00
rafpayen
51fd4afb79 add giza dictionary option
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3749 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-30 11:05:09 +00:00
mphi
ddabdf6b1b added support for arbitrary encodings via the $IO_ENCODING global variable on line 23; set to UTF8 by default
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3739 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-29 09:04:44 +00:00
hieuhoang1972
71093403df use gzipped extract file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3736 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-25 13:54:40 +00:00
hieuhoang1972
dd6c1e722e use gzipped extract file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3729 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-23 14:30:36 +00:00
hieuhoang1972
867a9bdf4b use gzipped extract file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3728 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-23 14:15:54 +00:00
hieuhoang1972
6f5d1e4732 deleting offending comment
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3724 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-21 16:35:31 +00:00
hieuhoang1972
4bc0a8e6b2 can set max num of lines for GT discount calc.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3723 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-19 20:11:10 +00:00
bojar
6616dd3f62 prettified usage string
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3714 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-16 00:26:50 +00:00
bojar
5c3a38bc2e fixed behaviour wrt to weight-d, don't expect it unconditionally as moses-chart
does not use it


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3713 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-16 00:00:17 +00:00
hieuhoang1972
57e3a92836 rollback. argument not supported by all iconv
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3712 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-15 12:50:11 +00:00
hieuhoang1972
ff339e56e3 don't drop unknown char. replace it with improbable string. avoid misalignment
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3709 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-14 20:50:15 +00:00
hieuhoang1972
f7904a871c add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3704 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-12 14:43:52 +00:00
hieuhoang1972
687cf9bf29 add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3702 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-12 14:20:11 +00:00
hieuhoang1972
a79a6bbaec add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3700 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-11 18:04:16 +00:00
hieuhoang1972
f1f04daa0a add empty line if input is empty line
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3699 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 12:11:55 +00:00
bojar
ff56054a03 removed --inputweights, read this information from link-param-count instead
added negatable --starting-weights-from-ini (defaulting to yes)
improved documentation of --activate-features


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3697 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 11:25:40 +00:00
bojar
9838de2a81 handle also gzipped ini files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3696 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 11:21:28 +00:00
nicolabertoldi
d38b319405 workaround to force the use of the bash shell in the SGE
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3695 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 10:32:34 +00:00
bgottesman
518035ed05 add --possiblyUseFirstToken option, which, when selected, allows certain sentence-initial tokens to be taken into account. See comment in header or support mailing list discussion.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3690 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-09 11:05:23 +00:00
phkoehn
7334d49191 minor experiment.perl fixes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3668 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 12:42:34 +00:00
pjwilliams
3ca16120a2 Add --MaxScope option to extract-rules (Hopkins and Langmead, 2010)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3661 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:55:57 +00:00
bojar
c0e0bc62c6 fixed a stupid bug from last commit
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3660 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:23:31 +00:00
bojar
878c7100de accept binarized ttables as well
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3659 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:22:01 +00:00
bojar
8cfc403fec default location of new mert
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3658 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:21:38 +00:00
phkoehn
c8ae94e426 training for global lexicon model
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3655 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-25 16:24:59 +00:00
chardmeier
ecf4b0d368 Check for right boost version in memscore.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3640 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-22 14:36:43 +00:00
phkoehn
ace33d16dd bug fix
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3636 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-22 07:30:52 +00:00
phkoehn
3b880bbdda added biconcor to make
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3634 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-21 10:04:38 +00:00
phkoehn
85a5a13e4c improvements to web analysis, fixes to syntax wrappers
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3633 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-21 09:49:27 +00:00
bhaddow
88eaf49c5e remove detokeniser
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3632 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-21 08:42:28 +00:00
sarst
0594b13c61 Added nonbreaking_prefix.sv for Swedish
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3630 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-19 12:45:49 +00:00
bhaddow
2dc951b062 More informative error messages
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3625 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-15 09:00:18 +00:00
rafpayen
a1ab166692 reset file handle between opens, so as to have an error if no file is given
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3623 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-14 18:23:05 +00:00
hieuhoang1972
e5edb4b971 delete duplicate detokenizer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3622 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-13 16:39:46 +00:00
hieuhoang1972
08739d8f49 add from josh's script
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3621 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-13 10:14:13 +00:00
hieuhoang1972
eedef63277 keep perl scripts with Unix line endings
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3612 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-11 11:32:27 +00:00
hieuhoang1972
105e83df82 beautify
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3610 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-07 19:08:44 +00:00
suzyh
d071296cde Fix to reuse-weights.perl to copy weights containing an exponent
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3590 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-29 04:04:12 +00:00
rsennrich
7929e4624e more informative error message when hierarchical phrase extraction fails.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3550 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-22 12:56:11 +00:00
rosasjolu
8746482d04 Change data files location
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3549 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-22 11:03:36 +00:00
rosasjolu
16302a45a8 git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3548 1f5c12ca-751b-0410-a591-d2e778427230 2010-09-22 11:02:29 +00:00
rosasjolu
d2fd75ac49 Change data files location
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3547 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-22 10:27:07 +00:00