394 Commits

Author SHA1 Message Date
hieuhoang1972
c1991f8a27 rewrite of lex prob calculation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4069 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-07 09:29:03 +00:00
bojar
a57e71a13f cherrypick r3919: zero jobs means serial
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4067 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-03 21:41:19 +00:00
bojar
1ba2de3c02 - cmert: added support for passing min and max values for weights
(used to be in old cmert but not in new cmert, i.e. moses/mert/)
- modified mert-moses.pl accordingly, esp. set min&max to 0&1 as it used to be
  hardwired in the new cmert
- adding mert-moses-ondrej.pl, a simplification of mert-moses.pl, please test it


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4066 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-03 21:01:16 +00:00
hieuhoang1972
126739f3f1 debug info
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4063 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 10:33:04 +00:00
hieuhoang1972
4d66952b9b gcc
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4062 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 10:27:47 +00:00
hieuhoang1972
efc9c77de6 lex prob, almost working
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4061 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 10:22:55 +00:00
hieuhoang1972
d72b7cde92 makefile
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4060 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 09:06:17 +00:00
hieuhoang1972
dd9a9b6e43 vs.net
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4059 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 05:48:44 +00:00
hieuhoang1972
8595b06dce rewrite lex prob calc
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4058 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 05:40:46 +00:00
phkoehn
1c671787d4 minor changes; allows specifying a corpus for the generation model (-generation-corpus)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4055 1f5c12ca-751b-0410-a591-d2e778427230
2011-06-30 16:00:18 +00:00
hieuhoang1972
024b5f9100 vs.net build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4048 1f5c12ca-751b-0410-a591-d2e778427230
2011-06-28 19:38:57 +00:00
pjwilliams
0c60dd7ef8 filter-rule-table: allow for non-integral rule counts.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4036 1f5c12ca-751b-0410-a591-d2e778427230
2011-06-24 18:32:14 +00:00
pjwilliams
c14723cc83 Oops, fix commit 4032: option is called --PhrasePairCount not --RuleCount.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4034 1f5c12ca-751b-0410-a591-d2e778427230
2011-06-24 16:40:17 +00:00
pjwilliams
a1ca2722df Add --MinNonInitialRuleCount option to filter-model-given-input.pl. This
prunes non-initial rules (i.e. rules with non-terminals) from the rule table
based on their frequency counts.  In Zollmann, Venugopal, Och, and Ponte (2008),
pruning hierarchical rules that occur only once was found to significantly
decrease rule table size without harming translation quality.

Also, add TUNING:filter-settings and EVALUATION[:<set>]:filter-settings
variables so that this can be enabled in the EMS.

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4033 1f5c12ca-751b-0410-a591-d2e778427230
2011-06-24 16:36:27 +00:00
pjwilliams
108dc4d12e Add --PhrasePairCount option to score.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4032 1f5c12ca-751b-0410-a591-d2e778427230
2011-06-24 16:24:33 +00:00
pjwilliams
0484d43a22 train-model.perl: don't write obsolete glue-rule-type option to config.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4031 1f5c12ca-751b-0410-a591-d2e778427230
2011-06-24 09:38:53 +00:00
hieuhoang1972
7408636328 add --MaxLinesGTDiscount to usage display
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3987 1f5c12ca-751b-0410-a591-d2e778427230
2011-05-22 16:02:26 +00:00
dowobeha
9375aa8846 Reverting changes. Revision 3971 was a bad commit.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3972 1f5c12ca-751b-0410-a591-d2e778427230
2011-05-13 18:27:30 +00:00
dowobeha
bb941c01f6 Merge branch 'master' into local-trunk
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3971 1f5c12ca-751b-0410-a591-d2e778427230
2011-05-13 18:07:21 +00:00
jhclark
fb1a668bba Allow absolutizing binary lexro models, too. We don't want them getting jealous of binarized phrase tables.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3955 1f5c12ca-751b-0410-a591-d2e778427230
2011-05-05 00:54:14 +00:00
hieuhoang1972
2af2ea4746 minor gcc error
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3953 1f5c12ca-751b-0410-a591-d2e778427230
2011-05-01 11:39:52 +00:00
pjwilliams
b186fcd2c7 Simple SCFG rule extraction speed-ups based on callgrind profile.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3946 1f5c12ca-751b-0410-a591-d2e778427230
2011-04-07 11:03:23 +00:00
bojar
65048a3714 zero jobs means serial
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3919 1f5c12ca-751b-0410-a591-d2e778427230
2011-03-09 10:43:34 +00:00
hieuhoang1972
1e76baa978 #include for Ubuntu build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3918 1f5c12ca-751b-0410-a591-d2e778427230
2011-03-08 15:45:03 +00:00
hieuhoang1972
2880656d8d option of outputting scoring to stdout
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3914 1f5c12ca-751b-0410-a591-d2e778427230
2011-03-07 02:44:34 +00:00
hieuhoang1972
cd384a1fbc option of outputting scoring to stdout
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3913 1f5c12ca-751b-0410-a591-d2e778427230
2011-03-05 15:38:50 +00:00
hieuhoang1972
a3d97584a9 run beautify.perl. Consistent formatting for .h & .cpp files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3902 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-24 13:57:11 +00:00
phkoehn
4c11bcd617 extensions to phrase table scoring options
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3893 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-23 10:27:54 +00:00
bhaddow
47df5fd51c Add triples with default values if insufficient number are supplied. Note
that min and max are no longer used, and should be removed at some point.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3874 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-08 16:26:17 +00:00
bhaddow
7651f3bb42 Fix command line parsing, update for new ttable format
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3868 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-03 09:55:08 +00:00
hieuhoang1972
fe0d53b73f vs.net
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3801 1f5c12ca-751b-0410-a591-d2e778427230
2011-01-17 15:48:46 +00:00
rafpayen
98495fc02c give absolute path for glue-grammar instead of ./model
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3793 1f5c12ca-751b-0410-a591-d2e778427230
2011-01-07 13:16:15 +00:00
nicolabertoldi
dbad1bb7aa mert-moses.pl now correctly calls Moses for generating nbest
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3782 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-15 14:49:34 +00:00
pjwilliams
3dec57a518 When scoring phrase pairs, store copies of the active pairs' PHRASE objects
instead of inserting them into a PhraseTable.  In a test on a 21GB
target-syntax extract file, this reduced user time from 195 to 120 mins.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3777 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-14 23:49:57 +00:00
pjwilliams
627d8edf8e Fix bug affecting Good-Turing discounting: repeated phrase pairs were always
contributing a count of 1 because PhraseAlignment::addToCount() was looking
for counts in the fifth column, not the fourth.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3775 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-14 16:31:53 +00:00
dowobeha
44b3af7cac Re-enabled --skip-decoder in mert-moses.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3759 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-03 16:44:13 +00:00
rafpayen
be92193c03 fix for multiple whitespace in dictionary
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3750 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-30 11:16:07 +00:00
rafpayen
51fd4afb79 add giza dictionary option
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3749 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-30 11:05:09 +00:00
hieuhoang1972
71093403df use gzipped extract file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3736 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-25 13:54:40 +00:00
hieuhoang1972
dd6c1e722e use gzipped extract file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3729 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-23 14:30:36 +00:00
hieuhoang1972
867a9bdf4b use gzipped extract file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3728 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-23 14:15:54 +00:00
hieuhoang1972
4bc0a8e6b2 can set max num of lines for GT discount calc.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3723 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-19 20:11:10 +00:00
bojar
6616dd3f62 prettified usage string
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3714 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-16 00:26:50 +00:00
bojar
5c3a38bc2e fixed behaviour wrt weight-d: don't expect it unconditionally, as moses-chart
does not use it


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3713 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-16 00:00:17 +00:00
hieuhoang1972
57e3a92836 rollback. argument not supported by all iconv
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3712 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-15 12:50:11 +00:00
hieuhoang1972
ff339e56e3 don't drop unknown char. replace it with improbable string. avoid misalignment
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3709 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-14 20:50:15 +00:00
hieuhoang1972
f7904a871c add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3704 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-12 14:43:52 +00:00
hieuhoang1972
687cf9bf29 add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3702 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-12 14:20:11 +00:00
hieuhoang1972
a79a6bbaec add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3700 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-11 18:04:16 +00:00
hieuhoang1972
f1f04daa0a add empty line if input is empty line
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3699 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 12:11:55 +00:00