1674 Commits

Author SHA1 Message Date
leven101
118a11497a Added suffix array phrase dictionary
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2890 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-12 11:05:43 +00:00
chardmeier
7117fef52a Fixed dependency on previous scores for lexical reordering states.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2889 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-11 17:08:24 +00:00
bhaddow
2396f6d9e9 Fix moses mt compile error
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2886 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-11 10:44:11 +00:00
sarst
e812b2a8e3 Bugfixes: printing the correct phrases in the table, and fixed misspelling of monotonicity.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2885 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-11 09:09:44 +00:00
sarst
92368ba490 Rewrote the lexical reordering model scoring in C++. Adapted train-factored-phrase-model.perl to that change. Minor fixes in other places, for compatibility
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2884 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-10 17:19:06 +00:00
abarun
20216dbfec Nbest MBR fix
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2879 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-09 11:37:33 +00:00
abarun
27c3b6c182 Disabled weight-file option - broken
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2873 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-08 17:34:14 +00:00
mphi
9e8352a041 modified the implementation, removing unnecessary repetition, thus making the whole process approximately fifty times faster
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2866 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-07 09:11:09 +00:00
mphi
05e21dc5e2 fixed compatibility of the --final-alignment-model 2 switch and mgiza (ibm2 is done in a single thread, hence no *.part* files)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2859 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-04 16:12:57 +00:00
abarun
4ac95fb82a Can now set thetas for Lattice MBR in terms of p and r (see Tromble et al. 08)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2855 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 19:46:35 +00:00
abarun
b4b7a71c46 Set appropriate verbose level for Lattice MBR
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2854 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 17:30:07 +00:00
abarun
904a982e05 Fixed bug in nbest MBR - now outputting correct MBR solution
Fixed bug in lattice MBR - now outputting correct MBR solution 


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2853 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 17:04:05 +00:00
bojar
ff05e5a1b5 list frequent mismatched tokenizations first
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2852 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 16:37:08 +00:00
bojar
9b10946f10 fixed regexes to read current -osg format
verbose at bad lines


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2850 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:35:21 +00:00
bojar
8891069ed8 safer extraction of job id
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2849 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:24:08 +00:00
bojar
a9c2069720 switching to bash, avoiding csh's >& redirection (not accepted by dash)
making --robust take the maximum number of restarts


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2848 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:23:53 +00:00
bojar
011840bf65 adding --force-factored-filenames to avoid problems with eager removal of '0-0'
Conflicts:

	scripts/training/train-factored-phrase-model.perl


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2847 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:23:35 +00:00
bojar
dbfe610546 uppercasing first letter even if after punct
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2846 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:23:20 +00:00
bojar
594e5e8acd adding a handy script for suspicious tokenization
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2845 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 14:23:06 +00:00
chardmeier
9adc3ee500 FFState ordering uses hypothesis ids now.
Changed order of scores to match latest trunk (exponential distortion next to lexical reordering).


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2844 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 13:10:31 +00:00
abarun
117f5ef329 Lattice MBR now uses nbest list as default hypothesis set during reranking.
To use lattice instead as hypothesis set, run with -lattice-hypo-set option


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2843 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 11:20:20 +00:00
abarun
6d7a710beb Added LatticeMBR to moses-cmd/Makefile.am
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2842 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 10:30:35 +00:00
abarun
32a58a3ec9 Implemented Lattice MBR for Moses.
Call using -lmbr option.
Specify thetas using -lmbr-thetas and lattice pruning factor (edge density) using -lmbr-pruning-factor
Currently only supports nbest-list as hypothesis set (specify using -nbest-hypo-set)


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2841 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-03 10:23:32 +00:00
pasmargo
275c06d9e7 Kneser-Ney and modified Kneser-Ney smoothing implementation.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2837 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-02 18:14:01 +00:00
pasmargo
63bdf3b602 reverted to previous version due to a couple of mistakes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2836 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-02 16:51:14 +00:00
pasmargo
71b15ae0d4 Kneser-Ney and modified Kneser-Ney smoothing implementation.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2835 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-02 16:28:56 +00:00
bhaddow
aa961a1c15 Fix compile error
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2824 1f5c12ca-751b-0410-a591-d2e778427230
2010-02-01 11:51:49 +00:00
naditomeh
242d6c6ddd word-based, phrase-based and hierarchical reordering is implemented in the training
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2823 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-31 23:56:45 +00:00
sanmarf
c37f2dc14e 4th MT marathon - lexical decomposition
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2805 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-30 10:11:35 +00:00
sarst
0785c09d03 bugfix for factored-training config
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2804 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-30 09:34:55 +00:00
sarst
f2a5678541 added new file hierarchical.h
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2803 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-30 09:16:27 +00:00
sarst
6a13b8d186 bugfixes in train-factored-phrase-model*.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2801 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-30 00:09:45 +00:00
chardmeier
be66463f5a Changed 'word' reordering type to 'wbe'. Added explanatory comments.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2800 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 23:58:07 +00:00
chardmeier
177277b7de Fixed score order in MSLR model.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2799 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 23:46:34 +00:00
chardmeier
1e076d51e7 Cleaned up forward reordering model.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2798 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 23:08:33 +00:00
chardmeier
2375365cf6 Fixed forward model.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2797 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 22:47:36 +00:00
sarst
bf70dd4767 submitted working scripts for hierarchical training (msd)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2796 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 22:38:18 +00:00
bojar
55e3ee4a30 just setting the executable bit
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2795 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 19:49:37 +00:00
bojar
2097e45edd a handy script for calculating out-of-vocabulary rate of n-grams
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2794 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 19:48:29 +00:00
phkoehn
d606f541e4 die fast
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2784 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 17:44:28 +00:00
bojar
9e7c97ddbc fixed a minor bug introduced by previous commit
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2781 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 17:11:34 +00:00
bojar
c12c8d85dd a bit more verbose about when and how to use -constraint
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2780 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 17:11:17 +00:00
naditomeh
ad3b0760b2 adding extract.cpp
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2770 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 13:59:04 +00:00
naditomeh
03de8a99d8 adding extract.cpp
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2769 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 13:41:34 +00:00
sarst
4eec020d5b bugfixes to train-factored-phrase-model.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2764 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 12:11:10 +00:00
sarst
766a0f95c0 allowing word in configuration of lexical reordering
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2761 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 11:37:12 +00:00
chardmeier
c73d90c911 Removed debugging instrumentation.
Changed score producer name to look more like the others.

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2760 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 11:33:28 +00:00
sarst
a9ef19edf0 updated train-factored-phrase-model.perl to work with the new hierarchical reordering framework
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2759 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 11:29:39 +00:00
chardmeier
c8674d45c0 Added comparison to self in FFStateArray::Compare.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/hierarchical-reo@2756 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 11:05:41 +00:00
bojar
9f784c6bf8 a handy script to get many translations from Google (can continue interrupted sessions)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2744 1f5c12ca-751b-0410-a591-d2e778427230
2010-01-29 01:48:13 +00:00