Commit Graph

618 Commits

Author SHA1 Message Date
chesio
e402ac25c5 Eppex is now properly released + reinjected into train-model.perl params.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4265 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-25 12:06:15 +00:00
hieuhoang1972
d2245390e0 visual studio build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4250 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-22 05:39:32 +00:00
hieuhoang1972
08805d6a9c comment revert
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4246 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-21 11:04:48 +00:00
hieuhoang1972
659e34735d xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4245 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-21 11:01:19 +00:00
pjwilliams
a064f799e0 Add scripts/analysis/extract-target-trees.py
Usage: extract-target-trees.py [FILE]

Reads moses-chart's -T output from FILE or standard input and writes trees to
standard output in Moses' XML tree format.

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4233 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-19 09:08:24 +00:00
bhaddow
2cf26266f7 debug should be off by default
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4232 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-18 21:23:32 +00:00
hieuhoang1972
4313e335b5 print out span widths of non-terms. Extra argument --OutputNTLengths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4230 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-16 17:13:34 +00:00
bhaddow
4d5b17f444 Option to create extract file with sentence ids
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4229 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-16 15:37:02 +00:00
phkoehn
0ab0df5aac Hopkins&May Interpolation, bugfix for sparse features
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4227 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-16 11:55:49 +00:00
bhaddow
fc695c38a7 Implementation of sharding and resampling in mert.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4226 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-15 17:45:35 +00:00
hieuhoang1972
1e1eb4d29e print out span widths of non-terms. Extra argument --OutputNTLengths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4225 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-14 18:16:05 +00:00
hieuhoang1972
149208ecba print out span widths of non-terms. Extra argument --OutputNTLengths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4224 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-14 10:23:14 +00:00
hieuhoang1972
d68274d217 print out span widths of non-terms. Extra argument --OutputNTLengths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4223 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-14 07:15:36 +00:00
hieuhoang1972
b1ca5e1fc8 xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4222 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-13 19:01:04 +00:00
hieuhoang1972
b8606b3e70 xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4221 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-13 18:51:28 +00:00
bhaddow
2c585ce6e7 restore
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4186 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-07 16:42:46 +00:00
bhaddow
de51b69d03 remove (temporarily)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4185 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-07 16:40:55 +00:00
phkoehn
41a1849437 support for sparse feature functions (mert support only when using PRO)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4184 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-07 16:37:33 +00:00
bhaddow
ca5c0f19b7 Multi-threading of mert, for random restarts.
Fix mert tests.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4182 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-07 08:08:35 +00:00
bojar
89c100ea83 revamp of mert-moses.pl (got rid of 'triples', relying on moses' -show-weights)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4149 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-17 09:15:19 +00:00
bojar
6e23604e7c removing old comments
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4148 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-17 09:04:49 +00:00
bojar
42ccbcc995 merged updates up to r4132 inclusive
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4147 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-17 09:02:45 +00:00
chesio
22da5782f3 Option to use --eppex added to train-model.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4143 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-14 17:42:12 +00:00
chesio
9f8fc06a2b Integration of eppex into scripts Makefile (similarly to memscore, fail of build won't stop the compilation).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4142 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-14 11:48:50 +00:00
chesio
27bb28885e Eppex: config.h renamed to typedefs.h (preparing for project autoconf-iguration).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4141 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-14 10:40:16 +00:00
chesio
019574fc61 Added eppex - an alternative to extract component of phrase-extract.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4138 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-10 09:13:29 +00:00
bgottesman
24f5bf6723 when detokenizing, remove whitespace between a pair of CJK (Chinese/Japanese/Korean) words
This gets the Chinese and Japanese tests working, so remove the failure expectation.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4134 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 15:30:54 +00:00
bgottesman
14587cdafc fix a detokenization bug that was preventing the removal of the whitespace following a contracted French or Italian article/pronoun (e.g. "l' immigration") when the contraction was the second-last word in the segment
remove the expectation of failure on the corresponding unit test


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4133 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 15:02:56 +00:00
rsennrich
79142d18e6 replace hard-coded path with variable
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4132 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 14:24:09 +00:00
hieuhoang1972
30ca534b86 faster scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4119 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:27:15 +00:00
bojar
779873a2a2 merged Philipp's updates up to r4106 inclusive
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4115 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 23:05:18 +00:00
bojar
7a301a7b5a negligible polishing
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4114 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 22:24:35 +00:00
hieuhoang1972
cdbb850cc3 fix new scorer to output phrase pairs in same order as old scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4110 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 07:36:25 +00:00
phkoehn
36db0ffe48 added pairwise ranked optimization (PRO) as proposed by [Hopkins&May,2011], just use switch --pairwise-ranked
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4106 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 17:00:17 +00:00
hieuhoang1972
65f7ffb783 delete debug message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4096 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:58:05 +00:00
hieuhoang1972
59771e8dbe delete debug message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4095 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:27:06 +00:00
hieuhoang1972
677378774a optimised version of score program. Original version is slow when source phrase has many target phrases 'cos it scans a large vector. New version puts it into a set. Slight hack in that it const_cast to get items out of the set. For a source with 100k targets, took 1.2sec, versus 2m20sec. Current version can take days to run. Won't make it the main score program until regression test for score is set up
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4093 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 09:40:58 +00:00
hieuhoang1972
3b1dac4178 start on speed optimisation for scoring
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4092 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 07:55:03 +00:00
hieuhoang1972
876ad74dbd create reverse phrase table. Not ready for prime time
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4091 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-26 09:12:58 +00:00
hieuhoang1972
a79651d239 fixed backoff phrase table. Allow backoff of unigrams
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4089 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-25 12:23:49 +00:00
hieuhoang1972
068c17f368 vs.net build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4087 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 23:26:56 +00:00
phkoehn
1bd74fc87f added random directions [Cer&al.,2008] and historic best as starting points [Foster&Kuhn,2009] to MERT
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4086 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 00:24:45 +00:00
hieuhoang1972
6a27dc4f17 example of how to run
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4084 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-22 08:32:09 +00:00
chesio
4918003635 absolutize_moses_model and clone_moses_model are now aware of suffix arrays format.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4082 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-21 23:15:08 +00:00
hieuhoang1972
9c0d725cde visual studio 2010
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4079 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-19 03:07:15 +00:00
hieuhoang1972
1190b75528 consolidate gzip files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4077 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-18 05:08:26 +00:00
hieuhoang1972
e174b5dea2 fix by Nicola Bertoldi for lexical probability calculation. Previous implementation was sensitive to double spaces and spaces at the beginning of the sentence, counting a space after another space as a word.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4075 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-14 11:26:35 +00:00
bhaddow
846748fa3f A more helpful error message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4072 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-08 20:07:07 +00:00
bojar
66b71a7f5c Ondrej's little tools to examine weight settings
not quite fit for public use, esp. the -summarize.sh one...


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4071 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-08 00:11:10 +00:00
bhaddow
8ffbe2389e Fix bug in handling of the score options
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4070 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-07 16:31:16 +00:00