Commit Graph

535 Commits

Author SHA1 Message Date
jhclark
fb1a668bba Allow absolutizing binary lexro models, too. We don't want them getting jealous of binarized phrase tables.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3955 1f5c12ca-751b-0410-a591-d2e778427230
2011-05-05 00:54:14 +00:00
hieuhoang1972
2af2ea4746 minor gcc error
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3953 1f5c12ca-751b-0410-a591-d2e778427230
2011-05-01 11:39:52 +00:00
pjwilliams
b186fcd2c7 Simple SCFG rule extraction speed-ups based on callgrind profile.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3946 1f5c12ca-751b-0410-a591-d2e778427230
2011-04-07 11:03:23 +00:00
bojar
65048a3714 zero jobs means serial
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3919 1f5c12ca-751b-0410-a591-d2e778427230
2011-03-09 10:43:34 +00:00
hieuhoang1972
1e76baa978 #include for Ubuntu build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3918 1f5c12ca-751b-0410-a591-d2e778427230
2011-03-08 15:45:03 +00:00
bojar
4bb2cd5994 support for --alignment-output-file (pass it to moses and later concatenate outputs)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3917 1f5c12ca-751b-0410-a591-d2e778427230
2011-03-08 07:02:49 +00:00
hieuhoang1972
2880656d8d option of outputting scoring to stdout
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3914 1f5c12ca-751b-0410-a591-d2e778427230
2011-03-07 02:44:34 +00:00
hieuhoang1972
cd384a1fbc option of outputting scoring to stdout
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3913 1f5c12ca-751b-0410-a591-d2e778427230
2011-03-05 15:38:50 +00:00
hieuhoang1972
a3d97584a9 run beautify.perl. Consistent formatting for .h & .cpp files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3902 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-24 13:57:11 +00:00
hieuhoang1972
4eb32d3f76 avoid mangling *.hh in kenlm
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3894 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-23 15:28:10 +00:00
phkoehn
4c11bcd617 extensions to phrase table scoring options
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3893 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-23 10:27:54 +00:00
bhaddow
c86e6b38b3 add new nonbreaking prefixes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3884 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-17 21:51:17 +00:00
phkoehn
df901e7ce6 added files from Tom Hoar
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3881 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-16 10:44:26 +00:00
bojar
76174ccd4b mark web/bin/detokenizer.perl as outdated
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3880 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-14 13:35:04 +00:00
bojar
26ccace946 Czech detokenization
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3879 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-14 13:32:41 +00:00
bhaddow
47df5fd51c Add triples with default values if insufficient number are supplied. Note that min and max are no longer used, and should be removed at some point.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3874 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-08 16:26:17 +00:00
bhaddow
7651f3bb42 Fix command line parsing, update for new ttable format
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3868 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-03 09:55:08 +00:00
bojar
72945c543e a script to convert AT&T FSA to 'python lattice format' that Moses reads
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3866 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-03 08:32:44 +00:00
hieuhoang1972
fe0d53b73f vs.net
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3801 1f5c12ca-751b-0410-a591-d2e778427230
2011-01-17 15:48:46 +00:00
bhaddow
a9cd71628a Change of boost macros - please make sure your favourite configuration still works
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3799 1f5c12ca-751b-0410-a591-d2e778427230
2011-01-13 23:38:48 +00:00
suzyh
994ccb12c2 Fix to EMS call of multi-bleu evaluation script
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3798 1f5c12ca-751b-0410-a591-d2e778427230
2011-01-13 04:53:09 +00:00
rafpayen
98495fc02c give absolute path for glue-grammar instead of ./model
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3793 1f5c12ca-751b-0410-a591-d2e778427230
2011-01-07 13:16:15 +00:00
nicolabertoldi
dbad1bb7aa now mert-moses.pl correctly calls Moses for generating nbest
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3782 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-15 14:49:34 +00:00
nicolabertoldi
ab2185c4a5 more robust behavior of qsub-wrapper.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3781 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-15 14:47:51 +00:00
pjwilliams
3dec57a518 When scoring phrase pairs, store copies of the active pairs' PHRASE objects instead of inserting them into a PhraseTable. In a test on a 21GB target-syntax extract file, this reduced user time from 195 to 120 mins.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3777 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-14 23:49:57 +00:00
pjwilliams
627d8edf8e Fix bug affecting Good-Turing discounting: repeated phrase pairs were always contributing a count of 1 because PhraseAlignment::addToCount() was looking for counts in the fifth column, not the fourth.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3775 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-14 16:31:53 +00:00
bhaddow
4174082396 Non-breaking prefixes for Dutch
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3764 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-08 16:09:24 +00:00
dowobeha
44b3af7cac Re-enabled --skip-decoder in mert-moses.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3759 1f5c12ca-751b-0410-a591-d2e778427230
2010-12-03 16:44:13 +00:00
rafpayen
be92193c03 fix for multiple whitespace in dictionary
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3750 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-30 11:16:07 +00:00
rafpayen
51fd4afb79 add giza dictionary option
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3749 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-30 11:05:09 +00:00
mphi
ddabdf6b1b added support for arbitrary encodings via the $IO_ENCODING global variable on line 23; set to UTF8 by default
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3739 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-29 09:04:44 +00:00
hieuhoang1972
71093403df use gzipped extract file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3736 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-25 13:54:40 +00:00
hieuhoang1972
dd6c1e722e use gzipped extract file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3729 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-23 14:30:36 +00:00
hieuhoang1972
867a9bdf4b use gzipped extract file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3728 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-23 14:15:54 +00:00
hieuhoang1972
6f5d1e4732 deleting offending comment
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3724 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-21 16:35:31 +00:00
hieuhoang1972
4bc0a8e6b2 can set max num of lines for GT discount calc.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3723 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-19 20:11:10 +00:00
bojar
6616dd3f62 prettified usage string
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3714 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-16 00:26:50 +00:00
bojar
5c3a38bc2e fixed behaviour wrt weight-d, don't expect it unconditionally as moses-chart does not use it
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3713 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-16 00:00:17 +00:00
hieuhoang1972
57e3a92836 rollback. argument not supported by all iconv
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3712 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-15 12:50:11 +00:00
hieuhoang1972
ff339e56e3 don't drop unknown char. replace it with improbable string. avoid misalignment
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3709 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-14 20:50:15 +00:00
hieuhoang1972
f7904a871c add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3704 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-12 14:43:52 +00:00
hieuhoang1972
687cf9bf29 add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3702 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-12 14:20:11 +00:00
hieuhoang1972
a79a6bbaec add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3700 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-11 18:04:16 +00:00
hieuhoang1972
f1f04daa0a add empty line if input is empty line
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3699 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 12:11:55 +00:00
bojar
ff56054a03 removed --inputweights, read this information from link-param-count instead
added negatable --starting-weights-from-ini (defaulting to yes)
improved documentation of --activate-features
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3697 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 11:25:40 +00:00
bojar
9838de2a81 handle also gzipped ini files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3696 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 11:21:28 +00:00
nicolabertoldi
d38b319405 workaround to force the use of the bash shell in the SGE
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3695 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 10:32:34 +00:00
bgottesman
518035ed05 add --possiblyUseFirstToken option, which, when selected, allows certain sentence-initial tokens to be taken into account. See comment in header or support mailing list discussion.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3690 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-09 11:05:23 +00:00
phkoehn
7334d49191 minor experiment.perl fixes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3668 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 12:42:34 +00:00
pjwilliams
3ca16120a2 Add --MaxScope option to extract-rules (Hopkins and Langmead, 2010)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3661 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:55:57 +00:00