981 Commits

Author SHA1 Message Date
hieuhoang1972
e6b7866f4a get rid of warning message in srilm class
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1251 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-28 11:59:32 +00:00
phkoehn
f1d2bd0eb5 added option -include-alignment-in-n-best to include the word alignment for each sentence in the n-best list file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1246 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-26 20:59:41 +00:00
hieuhoang1972
3413bf7046 visual studio output paths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1245 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-26 16:49:32 +00:00
jorcisai
ad28cee802 distortion filename was incorrectly written into moses.ini file in step 9
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1243 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-26 14:36:53 +00:00
phkoehn
a89acb34ae minor bug fix to recaser training
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1242 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-26 12:19:06 +00:00
hieuhoang1972
62b4741de0 move calling InitializeBeforeSentenceProcessing() & CleanUpAfterSentenceProcessing() in Manager class
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1241 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-24 20:32:32 +00:00
jorcisai
b361817067 In old_sge mode: sync script name is now prefixed by ${jobscript} to be able to run several moses_parallel.pl in parallel. A new function check_translation_old_sge was also added, derived from the former check_translation function
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1239 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-23 18:08:47 +00:00
jorcisai
d5b4565f23 language model parser for --lm option is now again able to parse $type, but it is backward compatible
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1238 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-23 17:32:51 +00:00
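
The two --lm commits above (three tokens, then an optional $type again) describe a backward-compatible option parser. A minimal sketch of that idea follows. The colon separator, the token order factor:order:filename[:type], and the function name parse_lm_spec are assumptions made for illustration; they are not taken from the training script itself.

```python
def parse_lm_spec(spec):
    """Parse a hypothetical --lm value of the form factor:order:filename[:type].

    The separator and field order are assumptions for this sketch.
    """
    tokens = spec.split(":")
    if len(tokens) == 3:
        # Older three-token form: no type given.
        factor, order, filename = tokens
        lm_type = None
    elif len(tokens) == 4:
        # Newer form with a trailing type token.
        factor, order, filename, lm_type = tokens
    else:
        raise ValueError(f"expected 3 or 4 tokens in --lm value, got {len(tokens)}: {spec!r}")
    return {
        "factor": int(factor),
        "order": int(order),
        "filename": filename,
        "type": lm_type,
    }


three = parse_lm_spec("0:3:lm/europarl.lm")
four = parse_lm_spec("0:3:lm/europarl.lm:1")
```

Leaving the type as None when it is absent keeps the choice of a default with the caller, since the log does not say which default the script applies.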
jorcisai
c69bd4079b reordering model was left in the local directory instead of model directory
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1236 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-23 12:27:50 +00:00
jorcisai
872f2d3612 Trying to parse $type in --lm option, but not available. So we just need to parse three tokens.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1235 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-23 11:49:54 +00:00
hieuhoang1972
bef38f4006 code cleanup
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1234 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-23 00:00:09 +00:00
hieuhoang1972
a1072b9a7a more verbose=0
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1233 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-22 23:54:59 +00:00
hieuhoang1972
c58393a4b4 verbose=0 nothing goes to stderr except for real, aborting errors
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1232 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-22 23:44:38 +00:00
phkoehn
6c5cb3a6ec changes to fit with edinburgh setup, added switch -generation-type: "single" only produces one probability, not both
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1231 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-22 19:37:11 +00:00
hieuhoang1972
8048aefeb0 fixed mem leak
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1230 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-22 12:04:53 +00:00
hieuhoang1972
6b611279d5 minor gcc compile error.
also, no longer use IRSTLM as a substitute for SRILM, and vice versa. They don't give identical results - avoids confusion.

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1229 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-21 20:05:27 +00:00
hieuhoang1972
b62dda41ed change unknown word processing to be closer to the way pharaoh does it - create unknown word whenever single word is not in translation table but penalise hypothesis for using it.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1228 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-21 19:51:17 +00:00
hieuhoang1972
7ecb0ce66e change unknown word processing to be closer to the way pharaoh does it - create unknown word whenever single word is not in translation table but penalise hypothesis for using it.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1227 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-21 19:48:53 +00:00
jdschroeder
9576345394 added recaser scripts to released-files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1226 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-21 15:47:13 +00:00
hieuhoang1972
7e0261b901 hack to fix hypo collection where all hypo scores are -inf. need to rethink pruning or creation of trans opt for unknown word
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1225 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-20 11:08:55 +00:00
phkoehn
9f227aa26b minor bug fix with config file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1224 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-19 18:35:51 +00:00
hieuhoang1972
53578eda97 minor gcc compile error
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1219 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-16 18:15:06 +00:00
hieuhoang1972
f3cbacba3e code cleanup - make FactorCollection and StaticData totally accessible only globally
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1218 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-16 18:08:37 +00:00
lexi_birch
59c4ba9f4d Merging from branch.
From now on, can have multiple decoder step lists to accommodate backoff
Specify this as an extra parameter in the [mapping] option in the ini file
This is backwards compatible.
Before (and still accepted):
[mapping]
T 0

Now you can have:
[mapping]
0 T 0
1 T 1
1 G 0

Imagine for instance the translation table 0 is words - words,
and the table 1 is stems - stems, and the generation table 0
is stems - words. This will allow us to back off to stems if
words are not found.

It is not really backoff because all the options from both
decoder step lists get included into the translation option collection,
which is then used to create the hypotheses.
The different paths must have their weights carefully balanced.
MERT might not be enough to discover the best weights for all the
combined parameters. 




git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1217 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-16 15:56:44 +00:00
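
The [mapping] format described in the commit above can be read as grouping steps into decoder step lists, with the old two-token form falling into list 0. The following is a minimal sketch of such a reader, not the decoder's own parser; the function name parse_mapping and the returned data layout are hypothetical.

```python
from collections import defaultdict


def parse_mapping(lines):
    """Group [mapping] entries into decoder step lists keyed by list index."""
    step_lists = defaultdict(list)
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # ignore blank lines inside the section
        if len(tokens) == 2:
            # Old format, e.g. "T 0": treated here as belonging to list 0.
            list_id, step_type, table_id = 0, tokens[0], int(tokens[1])
        elif len(tokens) == 3:
            # New format, e.g. "1 G 0": list index, step type, table index.
            list_id, step_type, table_id = int(tokens[0]), tokens[1], int(tokens[2])
        else:
            raise ValueError(f"malformed mapping line: {line!r}")
        if step_type not in ("T", "G"):
            raise ValueError(f"unknown step type {step_type!r} in line: {line!r}")
        step_lists[list_id].append((step_type, table_id))
    return dict(step_lists)


old_style = parse_mapping(["T 0"])
new_style = parse_mapping(["0 T 0", "1 T 1", "1 G 0"])
```

With the example from the commit message, list 0 holds the word-to-word translation step and list 1 holds the stem translation step followed by the stem-to-word generation step.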
bojar
2f4c70b4ae Die if aclocal, autoconf or automake fail
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1214 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-15 01:50:26 +00:00
bojar
6eacf476f0 Die if aclocal, autoconf or automake fail.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1213 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-15 01:49:37 +00:00
hieuhoang1972
4237cba9c3 check in eclipse proj to make bin table
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1212 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-14 20:32:57 +00:00
phkoehn
14839768c8 a large number of changes. besides little tweaks:
* training script now has proper default behaviour for single-factor models, 
* mert script has better handling of default lambda parameters that now
  works with lexicalized reordering models, and also with multiple 
  models files (e.g. multiple language models)
* parallel mert script is more robust when single jobs fail: detects it
  and resubmits the crashed (or killed) jobs
* recaser added that builds on moses
* filtering script added that also binarizes filtered model files
  (this will be eventually replaced when the lexicalized reordering
  model also uses the binary format)


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1210 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-13 19:22:35 +00:00
hieuhoang1972
e247f1da6f fixed regression test failing. Number of features for generation models MUST be specified in ini file, no backward compatibility hack
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1209 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-13 19:15:34 +00:00
hieuhoang1972
6b4dfc4db2 added #def to use hypo pool
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1206 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-12 11:05:13 +00:00
hieuhoang1972
4a30043757 remove irstlm vs project
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1204 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-12 10:21:35 +00:00
hieuhoang1972
ced1a06fff minor fn name change
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1203 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-10 19:40:36 +00:00
hieuhoang1972
2b9fc4b5cc minor fn name change
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1202 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-10 19:31:43 +00:00
hieuhoang1972
79b01784af minor tweak
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1201 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-10 19:22:40 +00:00
hieuhoang1972
1b2f95ad6a create eclipse project for processing bin phrase table
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1200 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-09 19:32:53 +00:00
hieuhoang1972
006e2724e0 take out irstlm on VS build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1195 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-09 17:36:07 +00:00
maurocettolo
7c7ee97f14 Minor revisions on consistency checks of IRSTLM package
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1190 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-09 13:44:46 +00:00
maurocettolo
075c88cd14 - update of irstlm library: files larger than 4 GB can be handled in
mmap (see irstlm/src/*cpp and irstlm/src/*h)
- fixed a bug in querying IRST LMs with OOVs (LanguageModelIRST.cpp)
- some more checks on config file: if specified, existence of generation
  and distortion files is checked (Parameter.cpp)
- 0 valued entries in binary phrase tables are loaded as 1.0e-38
  (PhraseDictionaryTree.cpp)



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1189 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-08 13:48:43 +00:00
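
The commit above loads zero-valued phrase table entries as 1.0e-38. A plausible motivation, stated here as an assumption rather than something the log confirms, is that scores are handled as logarithms, where a zero probability would become negative infinity. A minimal sketch of such a floor, with hypothetical names floor_prob and log_score:

```python
import math

# The floor value comes from the commit message; everything else is illustrative.
MIN_PROB = 1.0e-38


def floor_prob(p):
    """Return p unchanged unless it is zero or negative, then return the floor."""
    return p if p > 0.0 else MIN_PROB


def log_score(p):
    """Natural-log score that stays finite for zero-valued table entries."""
    return math.log(floor_prob(p))


zero_entry_score = log_score(0.0)   # finite, because 0.0 was raised to MIN_PROB
normal_entry_score = log_score(0.5)
```

Without the floor, math.log(0.0) raises ValueError in Python; in a floating-point log computation elsewhere it would typically yield negative infinity, which is the situation the earlier "all hypo scores are -inf" commit refers to.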
phkoehn
de9a5e96dd look for gzipped generation file if base file does not exist;
this should be done for all model files (lm, phrase table, reordering table)


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1183 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-07 19:02:42 +00:00
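
The fallback described in the commit above (use the gzipped file when the base file is missing) can be sketched as follows. The ".gz" suffix convention, the function name open_model_file, and the sample file name are assumptions for illustration only.

```python
import gzip
import os
import tempfile


def open_model_file(path):
    """Open path as text, falling back to path + '.gz' if path does not exist."""
    if os.path.exists(path):
        return open(path, "rt", encoding="utf-8")
    gz_path = path + ".gz"  # assumed naming convention for the compressed copy
    if os.path.exists(gz_path):
        return gzip.open(gz_path, "rt", encoding="utf-8")
    raise FileNotFoundError(f"neither {path} nor {gz_path} exists")


# Demonstration with a temporary directory that holds only the gzipped file.
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "generation.0-1")
    with gzip.open(base + ".gz", "wt", encoding="utf-8") as f:
        f.write("stem word 0.5\n")
    with open_model_file(base) as f:
        first_line = f.readline()
```

Because the helper is not tied to one file kind, the same call could serve the other model files the commit message mentions.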
hieuhoang1972
3d7da64118 delete old kdevelop templates folder
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1182 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-06 22:39:36 +00:00
hieuhoang1972
6d217c5dda tweaked function to add hypo to stack
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1180 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-06 22:20:41 +00:00
hieuhoang1972
f9ba4c910e ensure factor Id are completely contiguous. save mem in language model
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1171 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-05 13:24:34 +00:00
hieuhoang1972
96e1d67e8a assert some assumptions implicitly made
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1170 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-05 13:02:44 +00:00
hieuhoang1972
9cf7257d27 don't skip empty lines
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1169 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-05 10:24:01 +00:00
hieuhoang1972
7168a3264c speed up Word comparison
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1150 1f5c12ca-751b-0410-a591-d2e778427230
2007-01-25 16:47:30 +00:00
hieuhoang1972
07ed9e4c7a fixed prune arcList bug
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1149 1f5c12ca-751b-0410-a591-d2e778427230
2007-01-24 21:23:38 +00:00
hieuhoang1972
46e01f9f46 uninitialised variable
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1148 1f5c12ca-751b-0410-a591-d2e778427230
2007-01-24 20:54:37 +00:00
hieuhoang1972
e7838339bb exit/return from main()
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1141 1f5c12ca-751b-0410-a591-d2e778427230
2007-01-22 16:21:28 +00:00
nicolabertoldi
10074274b9 add n-best-factor
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1140 1f5c12ca-751b-0410-a591-d2e778427230
2007-01-22 14:57:23 +00:00
nicolabertoldi
be039bf84a Reduced time to die!!!
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1139 1f5c12ca-751b-0410-a591-d2e778427230
2007-01-22 14:47:13 +00:00