Commit Graph

94 Commits

Author SHA1 Message Date
phkoehn
41ee7f69a2 adapted mert to work with multiple decoding paths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1293 1f5c12ca-751b-0410-a591-d2e778427230
2007-03-09 17:58:05 +00:00
maurocettolo
5439a7796d Fixed a minor bug in mert-moses.pl regarding sanity checks for specified lambda triples
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1254 1f5c12ca-751b-0410-a591-d2e778427230
2007-03-01 13:02:22 +00:00
jorcisai
ad28cee802 distortion filename was incorrectly written into moses.ini file in step 9
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1243 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-26 14:36:53 +00:00
jorcisai
d5b4565f23 language model parser for --lm option is now again able to parse $type, but it is backward compatible
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1238 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-23 17:32:51 +00:00
jorcisai
c69bd4079b reordering model was left in the local directory instead of model directory
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1236 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-23 12:27:50 +00:00
jorcisai
872f2d3612 Trying to parse $type in --lm option, but not available. So we just need to parse three tokens.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1235 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-23 11:49:54 +00:00
phkoehn
6c5cb3a6ec changes to fit with edinburgh setup, added switch -generation-type: "single" only produces one probability, not both
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1231 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-22 19:37:11 +00:00
phkoehn
9f227aa26b minor bug fix with config file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1224 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-19 18:35:51 +00:00
phkoehn
14839768c8 a large number of changes. besides little tweaks:
* training script now has proper default behaviour for single-factor models, 
* mert script has better handling of default lambda parameters that now
  works with lexicalized reordering models, and also with multiple 
  models files (e.g. multiple language models)
* parallel mert script is more robust when single jobs fail: detects it
  and resubmits the crashed (or killed) jobs
* recaser added that builds on moses
* filtering script added that also binarizes filtered model files
  (this will be eventually replaced when the lexicalized reordering
  model also uses the binary format)


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1210 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-13 19:22:35 +00:00
hieuhoang1972
970af347e4 Bug fixed for distortion model proposed by Tim Murray
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1111 1f5c12ca-751b-0410-a591-d2e778427230
2007-01-07 12:36:18 +00:00
hieuhoang1972
0aba61ca8b don't insist on using python 2.3
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1104 1f5c12ca-751b-0410-a591-d2e778427230
2007-01-02 12:54:00 +00:00
lexi_birch
93937b529d Making remaining scripts os independent re pawd/pwd
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1103 1f5c12ca-751b-0410-a591-d2e778427230
2006-12-29 13:45:21 +00:00
nicolabertoldi
26aff6ead9 managing of pwd/pawd
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1101 1f5c12ca-751b-0410-a591-d2e778427230
2006-12-29 13:19:21 +00:00
lexi_birch
dee506806f Fix for mount bug using pwd on terabyte
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1096 1f5c12ca-751b-0410-a591-d2e778427230
2006-12-28 22:17:06 +00:00
hieuhoang1972
e701f57f07 halcyon days of the jhu workshop are over and grim reality has taken hold.
default qsub not to use workshop specific queue

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1079 1f5c12ca-751b-0410-a591-d2e778427230
2006-12-16 22:33:55 +00:00
bojar
72ff1f8450 added yet another combiner for factored corpora
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1026 1f5c12ca-751b-0410-a591-d2e778427230
2006-11-30 06:17:45 +00:00
bojar
412f04737c allows reducing factors from stdin
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1025 1f5c12ca-751b-0410-a591-d2e778427230
2006-11-30 03:46:21 +00:00
phkoehn
0a088dbb38 fixed error in filtering for lexicalized reordering tables
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@998 1f5c12ca-751b-0410-a591-d2e778427230
2006-11-22 15:50:20 +00:00
phkoehn
28ca9b57fd minor bug fixes for training and using lexicalized reordering
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@978 1f5c12ca-751b-0410-a591-d2e778427230
2006-11-15 17:04:19 +00:00
bojar
e2518b7799 - support for sun grid engine prior to v.6.0 in qsubwrapper and mert-moses
- changed temporary scripts to csh (because my sge runs them in csh regardless of my wishes)
- added two tests + sample data for the full chain: train-mert-decode-eval
  (a parallel and a serial version)
- cleanup of other tests
- Makefile rules for running single tests in foreground or background


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@899 1f5c12ca-751b-0410-a591-d2e778427230
2006-10-18 10:36:39 +00:00
nicolabertoldi
605e47d978 psyco library is maintained only for 386-compatible processors.
I modified score-nbest.py to import psyco only if $MACHTYPE is equal to "i386"
If MACHTYPE does not match, the psyco library is not imported,
but the script still works properly.
I do not know whether this check is effective under Windows


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@879 1f5c12ca-751b-0410-a591-d2e778427230
2006-10-12 09:49:09 +00:00
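The conditional import described in this commit could look roughly like the sketch below. This is not the code from score-nbest.py: the helper name `maybe_import_psyco` is hypothetical, and the fallback on `ImportError` is an extra assumption that the commit message does not mention.

```python
import importlib
import os


def maybe_import_psyco(environ=None):
    """Return the psyco module on i386 machines, otherwise None.

    `environ` defaults to os.environ; passing a plain dict makes the
    check easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    # Only attempt the import when MACHTYPE is exactly "i386".
    if env.get("MACHTYPE") != "i386":
        return None
    try:
        return importlib.import_module("psyco")
    except ImportError:
        # Assumption: a missing library is treated like an unsupported machine.
        return None


psyco = maybe_import_psyco()
if psyco is not None:
    # psyco.full() enables JIT specialization for the whole program.
    psyco.full()
```

Because the helper returns None instead of raising, the rest of the script runs unchanged, only without the speed-up.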
nicolabertoldi
24d3ed8a37 Changes to mert-moses.pl:
- added a flag (--no-filter-phrase-table) to disallow filtering of phrase tables (useful if binary phrase tables are used)
- added a flag to compute bleu score without text normalization (--nonorm) (default is with normalization)
- added a flag to compute bleu score with the "closest reference length" (--closest), which is 
   alternative to "average reference length" (--average) or "shortest reference length" (default)
- added a parameter (--inputype=[0|1]) to manage different input types (0 for text, 1 for confusion network, default is 0)
Changes to moses-parallel.pl:
- corrected a typo


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@878 1f5c12ca-751b-0410-a591-d2e778427230
2006-10-11 16:58:30 +00:00
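The three reference-length choices named in this commit (shortest, average, closest) can be illustrated with a small sketch. It is an assumption-laden illustration, not the scoring code used by mert-moses.pl: the function name `reference_length`, its signature, and the tie-breaking rule (prefer the shorter reference when two are equally close) are all hypothetical.

```python
def reference_length(candidate_len, reference_lens, mode="shortest"):
    """Pick the reference length used for the BLEU brevity penalty.

    mode is one of "shortest", "average", or "closest".
    """
    if not reference_lens:
        raise ValueError("at least one reference length is required")
    if mode == "shortest":
        return min(reference_lens)
    if mode == "average":
        return sum(reference_lens) / len(reference_lens)
    if mode == "closest":
        # Smallest absolute distance to the candidate length wins;
        # ties go to the shorter reference (assumed convention).
        return min(reference_lens, key=lambda r: (abs(r - candidate_len), r))
    raise ValueError("unknown mode: %r" % (mode,))


# A candidate of 11 tokens scored against references of 8, 10 and 13 tokens.
lengths = [8, 10, 13]
for mode in ("shortest", "average", "closest"):
    print(mode, reference_length(11, lengths, mode))
```

With several references of differing length, the choice changes the brevity penalty, so scores computed under different modes are not directly comparable.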
bojar
d90b1d348e reuses lexical translation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@876 1f5c12ca-751b-0410-a591-d2e778427230
2006-10-10 13:53:45 +00:00
bojar
2eb05906aa skips giza if older output reusable
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@872 1f5c12ca-751b-0410-a591-d2e778427230
2006-10-09 15:01:27 +00:00
bojar
998a8216ba skips mkcls and some other steps, if already finished
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@870 1f5c12ca-751b-0410-a591-d2e778427230
2006-10-06 21:19:29 +00:00
nicolabertoldi
a73c412b88 added clean to some Makefiles
use of "make clean" in scripts/Makefile



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@867 1f5c12ca-751b-0410-a591-d2e778427230
2006-10-06 15:07:30 +00:00
bojar
c8f5e2aeba fixed an error message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@866 1f5c12ca-751b-0410-a591-d2e778427230
2006-10-06 10:47:16 +00:00
phkoehn
a71f247596 bugfix: option rootdir misnamed roodir
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@835 1f5c12ca-751b-0410-a591-d2e778427230
2006-09-28 19:25:08 +00:00
mfederico
ef42ad791e symal.cpp: just a minor change
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@833 1f5c12ca-751b-0410-a591-d2e778427230
2006-09-28 16:14:21 +00:00
bojar
271b78d94c Just checking if I can commit. Added my name.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@820 1f5c12ca-751b-0410-a591-d2e778427230
2006-09-20 14:20:17 +00:00
redpony
9582bcecff turn on O3 optimization for symal
increase MAX_WORD in symal.cpp (I was hitting this limit in a Chinese corpus that had some tokenization errors)



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@816 1f5c12ca-751b-0410-a591-d2e778427230
2006-09-18 15:36:04 +00:00
redpony
da7fed9e7e add --corpus-compression [gz|bz2] to allow corpora to be compressed
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@814 1f5c12ca-751b-0410-a591-d2e778427230
2006-09-15 12:38:13 +00:00
redpony
7d50d155dc fix compilation error on gcc 4.1, fix warnings in mert
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@813 1f5c12ca-751b-0410-a591-d2e778427230
2006-09-12 19:46:16 +00:00
redpony
c69cfaf33e Allow the factor delimiter, that is, the string that separates the factors in a 'word' to be specified to moses and to train-factored-phrase-model.perl. The default is still to use '|'. Multi-character delimiters are allowed (for example, '+++'). Added a regression test for multi-character delimiters.
Remove JHU dependencies on make release.  It now looks for GIZA++ and sets the BINDIR inside train-factored-phrase-model.perl at installation time (note: because of this, this script MUST BE released before it can be run now).



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@812 1f5c12ca-751b-0410-a591-d2e778427230
2006-09-12 15:53:50 +00:00
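As a rough illustration of the configurable factor delimiter described in this commit, the sketch below splits a factored 'word' on either the default '|' or a multi-character delimiter such as '+++'. The helper names `split_factors` and `join_factors` are hypothetical and do not come from moses or train-factored-phrase-model.perl.

```python
def split_factors(token, delimiter="|"):
    """Split one factored token into its list of factors."""
    if not delimiter:
        raise ValueError("delimiter must be a non-empty string")
    # str.split treats the delimiter as a literal string, so
    # multi-character delimiters need no special handling.
    return token.split(delimiter)


def join_factors(factors, delimiter="|"):
    """Inverse of split_factors: rebuild the factored token."""
    return delimiter.join(factors)


print(split_factors("house|NN|house"))
print(split_factors("house+++NN+++house", delimiter="+++"))
```

A multi-character delimiter is useful when the default '|' can occur inside the surface forms themselves.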
phkoehn
572c577ef7 initial release
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@806 1f5c12ca-751b-0410-a591-d2e778427230
2006-09-01 18:02:56 +00:00
nicolabertoldi
041e6ed3c5 Changes to compilation scripts:
- irstlm/src/Makefile.am did not install some files
- irstlm/mkinstalldirs needed by OSX
- irstlm/regenerate-makefiles.sh substitutes 
  explicit calls of aclocal, autoconf and automake

Changes to scoring script used by MERT
- added the option ("-e") to compute BLEU wrt the
  "closest" reference length like in multi-bleu.perl
- now multi-bleu.perl manages 0 counts for ngram-statistics




git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@805 1f5c12ca-751b-0410-a591-d2e778427230
2006-09-01 14:54:41 +00:00
bojar
53bbbbfa22 --continue now also attempts to step one extra step back if necessary moses output is not found
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@754 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-15 15:16:28 +00:00
bojar
568cff8e34 fixed serious stupid bug: value ranges were ignored and min. and max were set to the starting value
this bug occurred only if lambdas were supplied on command line, not with the default lambdas and ranges


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@753 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-15 15:04:19 +00:00
bojar
5c2d19a156 reversed exit codes of symal and added safesystem to call symal
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@730 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-14 17:54:11 +00:00
bojar
7735bc6b6d the python compiled files should not be in the cvs
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@729 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-14 17:53:23 +00:00
mfederico
f211a2a738 New version with c++ module (symal) performing step (3).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@728 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-14 17:43:17 +00:00
mfederico
6d6ac5c1e4 New version with faster computation of word alignments.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@718 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-14 15:18:12 +00:00
mfederico
c3ea1ef545 Filter to make GIZA++ alignment files more readable.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@717 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-14 15:06:58 +00:00
mfederico
e72010d6ce A tool to compute symmetric alignments from GIZA++ alignments.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@716 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-14 15:03:49 +00:00
bojar
840441dc1a die if phrase mismatch discovered
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@688 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-13 05:35:02 +00:00
bojar
6fc349f75f gives nice overview of model complexity (in terms of ambiguity in translation and generation tables)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@670 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-11 22:33:48 +00:00
bojar
e1936af681 marking finished_step also after last iteration finished
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@655 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-11 17:05:04 +00:00
bojar
68ef1413cd allows arbitrary mixing of 'kept' and 'added' factors in output
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@627 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 22:00:02 +00:00
bojar
9b23b6d9c8 die in safesystem on child's death
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@612 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 01:42:49 +00:00
bojar
af1be61259 die when there are no phrases in input
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@611 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 01:37:29 +00:00