1553 Commits

Author SHA1 Message Date
Hieu Hoang
0e46cd377c Merge branch 'master' into nadir_osm 2013-07-03 20:24:20 +01:00
Nadir Durrani
fbdb07a94c EMS 2013-07-03 10:54:38 +01:00
Nadir Durrani
82d6105f05 OSM Training Script 2013-07-02 13:59:47 +01:00
Hieu Hoang
4e4cf1e313 script to replace numbers with placeholder. /Achim Ruopp 2013-07-01 23:00:59 +01:00
Wilker Aziz
f3cd72537c Merge branch 'master' of https://github.com/moses-smt/mosesdecoder 2013-06-24 15:39:18 +01:00
Wilker Aziz
2c19238c24 Patching up the suffix array wrappers 2013-06-24 15:38:10 +01:00
Wilker Aziz
b49e6a162f Wrapper to lmplz 2013-06-24 12:20:20 +01:00
Hieu Hoang
a85f819a53 superseded 2013-06-24 11:33:11 +01:00
phikoehn
f5b8c47a2e Merge branch 'master' of ssh://github.com/moses-smt/mosesdecoder 2013-06-23 17:19:37 +01:00
phikoehn
164b06cd7e debugging 2013-06-23 17:19:22 +01:00
Hieu Hoang
dc33fa3d3d redo parsing of feature function parameters 2013-06-20 12:50:41 +01:00
Hieu Hoang
029110c245 change table-limit specification to new format 2013-06-14 10:09:06 +01:00
Hieu Hoang
abe6bb7c22 refactor parsing of feature function args 2013-06-10 18:11:55 +01:00
phikoehn
54f2ea07bd handle sparse features in translation table 2013-06-09 20:00:19 +01:00
phikoehn
ce372477c9 conversion script from Moses V1.0 moses.ini files to current format - may need some further tweaking 2013-06-09 14:28:56 +01:00
phikoehn
2e8fbe77a2 corrected example files 2013-06-08 14:45:55 +01:00
phikoehn
730da7edec sparse feature specification bug fix 2013-06-08 13:39:15 +01:00
Hieu Hoang
a974bbafac Merge pull request #37 from neubig/ems-interpolate-scientific
Fixed crash in interpolation for small lambdas
2013-06-07 16:24:32 -07:00
Rico Sennrich
8581fb9518 fix (minor) unicode warning, and update permissions 2013-06-03 13:48:31 +02:00
Graham Neubig
33d5aac6af Fixed crash in interpolation for small lambdas
The EMS crashed when interpolating language models when the ideal lambdas included numbers so small that they required scientific notation (eg: 1.332e-07). This patch adds "e" and "-" to the acceptable numbers to fix this problem
2013-06-01 12:37:24 +09:00
phikoehn
68501f5a36 bug fix with weight substitution 2013-05-31 12:27:35 +01:00
Hieu Hoang
f83622b0b7 figure out which feature function to apply at which decode step. Book-keeping 2013-05-30 17:16:10 +01:00
Hieu Hoang
8f7c12ef40 beautify 2013-05-29 18:19:06 +01:00
Hieu Hoang
6249432407 beautify 2013-05-29 18:16:15 +01:00
Hieu Hoang
beaf295741 beautify 2013-05-27 17:47:54 +01:00
Hieu Hoang
6996973a56 beautify 2013-05-27 17:42:27 +01:00
phikoehn
8944ea541a fast align parameter 2013-05-25 23:20:27 +01:00
phikoehn
542cd72c63 moved config creation back into train-model.perl 2013-05-19 03:28:02 +01:00
Hieu Hoang
0596ba4245 carry [weight-file] from tuned ini 2013-05-17 18:23:55 +01:00
Hieu Hoang
11632e298e add substitute-filtered-tables-and-weights.perl for applying filter for evaluation step 2013-05-17 16:13:24 +01:00
Hieu Hoang
42c292765a add substitute-filtered-tables-and-weights.perl for applying filter for evaluation step 2013-05-17 13:28:21 +01:00
phikoehn
4cdffc8a89 fixes for sparse feature handling 2013-05-17 08:37:29 +01:00
phikoehn
13991fc88f added specification to example config files for fast align 2013-05-17 06:42:54 +01:00
Hieu Hoang
35d37a91a1 Don't print 'sparse' for sparse feature functions. All feature functions can contain dense and sparse 2013-05-16 23:36:59 +01:00
Barry Haddow
585786d26b can specify location of create-ini 2013-05-16 19:34:56 +01:00
Hieu Hoang
f96a82d26c add normalize-punctuation.perl, from WMT 2013-05-16 17:03:37 +01:00
Hieu Hoang
8dd84d7a40 change integration of sparse features with EMS to account for new weights format 2013-05-16 15:38:05 +01:00
Hieu Hoang
7522f3963c change PhraseDictionaryTreeAdaptor --> PhraseDictionaryBinary 2013-05-15 09:35:28 +01:00
Barry Haddow
a4ce50f2fb Fix for cygwin 2013-05-14 08:54:29 +01:00
phikoehn
41da5b2760 Merge branch 'master' of git://github.com/moses-smt/mosesdecoder 2013-05-12 08:16:22 +01:00
Hieu Hoang
40bc98df56 filter Memory --> OnDisk and Binary 2013-05-11 21:57:03 +01:00
Hieu Hoang
51fc6fdcb6 filter Memory --> OnDisk and Binary 2013-05-11 21:46:45 +01:00
Hieu Hoang
a8f4e2c8fe changes for cruise control 2013-05-10 15:43:49 +01:00
Hieu Hoang
e2f2aff94a merged. Mostly by discarding new changes 2013-05-03 14:36:39 +01:00
Barry Haddow
8a965cd62e Fixes to binarize-all 2013-05-03 10:15:37 +01:00
Barry Haddow
8993339df4 Make sure tuning uses filtered config when available. 2013-05-02 18:50:21 +01:00
Barry Haddow
cf47ad132c Ability to specify number of conf net weights 2013-05-02 18:50:03 +01:00
Hieu Hoang
929b153216 merge 2013-05-02 17:59:36 +01:00
Barry Haddow
48fe0610ef Merge branch 'master' of github.com:moses-smt/mosesdecoder
Conflicts:
	scripts/training/filter-model-given-input.pl
2013-05-02 17:02:51 +01:00
Barry Haddow
5eebb9538e Enable skipping of filtering in EMS
Use 'binarize-all = path-to-binarize-model.perl
2013-05-02 15:15:52 +01:00
Barry Haddow
72cbadaed5 Back to type 99 2013-05-02 15:11:17 +01:00
Ondrej Bojar
0450fd6776 Merge branch 'master' of https://github.com/moses-smt/mosesdecoder 2013-05-02 00:33:52 +02:00
Ondrej Bojar
b57f5a530e allow disabling distortion model binarization 2013-05-02 00:33:09 +02:00
phikoehn
d19a28ae21 Merge branch 'master' of git://github.com/moses-smt/mosesdecoder 2013-05-01 19:22:00 +01:00
phikoehn
cd8915647b support for Chris Dyer's fast-align; bug fix with sparse word translations feature; threshold pruning in filter 2013-05-01 19:20:05 +01:00
Hieu Hoang
3ed17bbedd merge 2013-05-01 11:50:29 +01:00
Barry Haddow
5638aa6a32 don't rebuild tables when a TRAINING:config is specified 2013-05-01 11:25:32 +01:00
Ondrej Bojar
3e2b83444d gzip ttable when binarizing, use --tempdir, fixed a flushing bug 2013-05-01 11:49:58 +02:00
Barry Haddow
b38c318e84 Update to use multi model 2013-04-30 14:04:39 +01:00
Hieu Hoang
ce95c117f6 merge 2013-04-29 18:46:48 +01:00
Rico Sennrich
908c006e32 online combination of multiple phrase tables
- creates a virtual phrase table at decoding time based on a vector of component models and a combination algorithm
  - linear interpolation or instance weighting
  - two possible component model types supported so far: 0 (in-memory) or 12 (compact)
  - weights can be set in config, and overridden on a sentence-level through mosesserver API
  - online optimization (perplexity minimization) using dlib and xmlrpc-c call
2013-04-22 13:21:59 +02:00
Hieu Hoang
b1da4dbe0e merged 2013-04-19 15:03:34 +01:00
Barry Haddow
9d42c7f6f7 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2013-04-12 16:07:26 +01:00
Hieu Hoang
44a0e52e30 fixed ShowWeights() for confusion networks. This is a reason why we should get rid of ShortNames and move to refactored moses pdq 2013-04-09 14:44:32 +01:00
Hieu Hoang
e603e965a5 Merge github.com:moses-smt/mosesdecoder into weight-new 2013-04-08 17:52:27 +01:00
phikoehn
ac82be3120 Hal moved. We follow. 2013-04-03 21:59:03 +01:00
Hieu Hoang
2655300c83 Merge github.com:moses-smt/mosesdecoder into weight-new 2013-04-03 19:26:49 +01:00
Ondrej Bojar
93433cf015 support --translation-details OUTFILE in moses-parallel 2013-04-03 18:10:44 +02:00
Hieu Hoang
71a2b49a47 Merge github.com:moses-smt/mosesdecoder into weight-new 2013-04-01 16:43:32 +01:00
phikoehn
0a978e9f01 bug fixes 2013-04-01 14:31:32 +01:00
Hieu Hoang
fd4e954322 merge 2013-03-24 09:57:36 +00:00
Barry Haddow
42526b5b6e Merge branch 'master' of github.com:moses-smt/mosesdecoder 2013-03-18 21:50:11 +00:00
Barry Haddow
8efeb59228 don't lowercase reference if there's a recaser 2013-03-18 21:29:17 +00:00
Achim
038871fdb3 Hungarian and Latvian non-breaking prefix files 2013-03-18 17:17:35 -04:00
Hieu Hoang
f956eeef23 Merge github.com:moses-smt/mosesdecoder into weight-new 2013-03-18 16:50:21 +00:00
Hieu Hoang
1b83b85f44 debug info from sort command 2013-03-18 16:48:40 +00:00
Hieu Hoang
1f1a0297db runtime error in creating ini file 2013-03-16 15:05:14 +00:00
Barry Haddow
4c2e2d768b Update mert training to use interpolated ttable 2013-03-15 16:13:33 +00:00
phikoehn
3a7f4f776a minor 2013-03-13 17:54:29 +00:00
Hieu Hoang
2f78fe5fe5 Merge github.com:moses-smt/mosesdecoder into weight-new 2013-03-13 17:54:03 +00:00
Hieu Hoang
3271710dd5 tell create-ini about input factors 2013-03-11 19:31:53 +00:00
Hieu Hoang
7fb363d922 tell create-ini what factors are being used 2013-03-11 13:42:26 +00:00
Hieu Hoang
96d76ff642 correctly call substitute-weights.perl in apply-weights 2013-03-08 12:53:18 +00:00
Hieu Hoang
0869921b80 remove misc/reuse-weights.cpp 2013-03-07 16:53:50 +00:00
Hieu Hoang
0f2b2acd78 added substitute-weights.perl 2013-03-06 19:34:48 +00:00
Hieu Hoang
881787de3f Allow overriding of feature name 2013-03-06 14:04:09 +00:00
Hieu Hoang
3e60705ec2 change format of pt. Allow overriding of feature name 2013-03-06 12:39:41 +00:00
Hieu Hoang
3d6068923b Merge github.com:moses-smt/mosesdecoder into weight-new 2013-03-06 08:50:58 +00:00
Hieu Hoang
942c8423c6 change format of pt 2013-03-05 17:36:05 +00:00
Hieu Hoang
26d1914d22 change format of pt 2013-03-05 16:41:14 +00:00
Phil Williams
6fa279fadb filter-rule-table.py: change default pruning count from 1 to 0
Change the default pruning threshold from 1 to 0 to allow for
Hiero-style fractional counts.
2013-03-04 21:02:50 +00:00
Hieu Hoang
35bffae402 Merge github.com:moses-smt/mosesdecoder into weight-new 2013-03-04 18:19:28 +00:00
Christian Buck
26bf04df5d added unbuffered mode for casers (using -b) 2013-03-04 15:29:13 +00:00
Hieu Hoang
9d4f1c8555 Merge github.com:moses-smt/mosesdecoder into weight-new 2013-03-02 18:03:08 +00:00
amittai
7ca271b200 fixed typo 2013-02-26 19:47:44 -08:00
amittai
1f82a43837 where'd the edit go? 2013-02-26 11:37:31 -08:00
amittai
5cdf65ba33 Revert "Revert "let's be consistently case-insensitive with respect to the xml tags""
This reverts commit 8b6e98c633.
2013-02-26 11:32:29 -08:00
amittai
1fb51dc674 use 'gunzip -c' instead of 'zcat' for better cross-platform compatibility
zcat is identical to "gunzip -c", but Mac OS X doesn't ship with zcat.
2013-02-26 11:19:33 -08:00
Barry Haddow
ed117f55c9 Timing info. Command line args. 2013-02-25 09:36:58 +00:00
amittai
8b6e98c633 Revert "let's be consistently case-insensitive with respect to the xml tags"
This reverts commit 2eb0c5e11d.
2013-02-24 18:10:19 -08:00
amittai
2eb0c5e11d let's be consistently case-insensitive with respect to the xml tags 2013-02-24 18:07:11 -08:00
Barry Haddow
2f221473f0 Change from phrase-weighting to promix 2013-02-21 21:40:01 +00:00
Barry Haddow
51ab9aa19d Merge remote branch 'origin/master' into phrase-weighting 2013-02-21 17:34:59 +00:00
Hieu Hoang
5c10a2889e Merge github.com:moses-smt/mosesdecoder into weight-new 2013-02-20 17:02:34 +00:00
Hieu Hoang
b9b74c53ae new ini file with hiero models 2013-02-18 16:51:44 +00:00
Hieu Hoang
9fe9b0008b new ini file with hiero models 2013-02-18 12:05:10 +00:00
Barry Haddow
9ca364fb22 Implement brevity penalty smoothing for PRO
As in Nakov et al (Coling 2012)
2013-02-18 11:11:20 +00:00
Hieu Hoang
30850bf45f new substitute-filtered-tables.perl 2013-02-16 11:01:23 +00:00
Hieu Hoang
a9e2be2896 filter new ini file 2013-02-16 00:52:58 +00:00
Hieu Hoang
ee39104c3a subs join_array and set_value 2013-02-16 00:23:05 +00:00
Hieu Hoang
d659e59e9d filter new ini file 2013-02-14 22:13:39 +00:00
Hieu Hoang
f32d337fc0 filter new ini file 2013-02-14 19:02:00 +00:00
Hieu Hoang
221dba2820 filter new ini file 2013-02-14 18:13:11 +00:00
Hieu Hoang
a7144edfdc indent 2013-02-14 16:05:38 +00:00
Hieu Hoang
ae52a15c4d use new create-ini program to create ini file, rather than step 9 of train-model.perl 2013-02-14 12:07:48 +00:00
Hieu Hoang
825edd282b use new create-ini program to create ini file, rather than step 9 of train-model.perl 2013-02-14 11:19:40 +00:00
Hieu Hoang
4b37050853 resolved conflicts 2013-02-01 20:35:48 +00:00
Ales Tamchyna
2b7725db34 support LM OOV feature in train-model.perl 2013-02-01 15:47:05 +01:00
Hieu Hoang
4d32d8b64b merge 2013-01-19 18:47:04 +00:00
Tetsuo Kiso
8b4a1fa2b8 Fix MegaM's URL.
Because I got a 404 error.
2013-01-20 02:30:20 +09:00
Hieu Hoang
aadefc6df9 Merge branch 'master' into weight-new 2013-01-17 18:05:02 +00:00
amittai
176647e342 accept either "mgiza" or "mgizapp" and either "snt2cooc.out" or "snt2cooc"
Fixed a mismatch between the wiki and mgiza.

Installing mgiza produces a file called "mgiza".
However, the Moses instructions on the wiki here
http://www.statmt.org/moses/?n=Moses.ExternalTools#mgiza
insist that the "mgiza" binary be renamed "mgizapp", but then
train-experiment.perl only accepts the binary called "mgiza".
2013-01-15 19:11:49 -08:00
phikoehn
124c36a837 bug fix with MML settings 2013-01-14 19:39:26 +00:00
phikoehn
a7f7379fa4 fixed bug in detruecaser / interaction with escaping 2013-01-14 19:25:43 +00:00
phikoehn
d5cf38cab2 Merge branch 'master' of git://github.com/moses-smt/mosesdecoder 2013-01-14 19:23:02 +00:00
phikoehn
344b150372 bug fixes with escaping / truecasing interactions 2013-01-14 19:22:29 +00:00
Hieu Hoang
fa60724391 merge 2013-01-14 10:39:41 +00:00
Kenneth Heafield
c9687e3b50 Fix longstanding bug in sentence splitter spacing.
"Foo Bar.  Baz Quux." is two sentences even though there are two spaces instead of one.
2013-01-11 13:32:24 +00:00
Graham Neubig
c55a1474df Updated experiment.meta 2013-01-10 16:16:23 +09:00
Barry Haddow
936dbf6516 Instance weighting 2013-01-08 16:40:00 +00:00
Barry Haddow
c86c11abbe instance weighting of lex weights 2013-01-08 15:34:29 +00:00
Barry Haddow
a55a936182 remove warning 2013-01-08 14:28:16 +00:00
Barry Haddow
459acf87b1 Add support for instance weights file 2013-01-04 14:55:24 +00:00
Hieu Hoang
bc615bdac8 Merge branch 'master' into weight-new 2012-12-18 12:46:00 +00:00
Hieu Hoang
7ed52368a8 more minor changes in scripts for new format 2012-12-18 12:45:11 +00:00
Hieu Hoang
aa00aebee6 rewrite reuse-weight 2012-12-18 10:23:00 +00:00
Phil Williams
06081f7ddb extract-target-trees.py: minor fixes, code style 2012-12-17 18:49:50 +00:00
Hieu Hoang
667b8a495f decoder -show-weight to output consistent new format 2012-12-17 17:17:44 +00:00
hieu
6694df6519 mert-moses.perl create ini file in new format 2012-12-14 16:43:12 +00:00
hieu
801c4c4f4f mert-moses.perl create ini file in new format 2012-12-14 13:29:26 +00:00
hieu
55c46cab49 train-model.perl create ini file in new format 2012-12-14 12:32:45 +00:00
hieu
54fc153a52 mert-moses.perl completes all iterations 2012-12-13 12:04:54 +00:00
hieu
e9dbcdb87f mert-moses.perl completes the 1st iteration 2012-12-13 12:01:39 +00:00
phikoehn
b275c94dbf allow for inclusion of extract from previous run 2012-12-12 07:02:59 +00:00
phikoehn
24e1df7520 support for use of baseline alignment model 2012-12-12 03:59:14 +00:00
phikoehn
438dcb1a34 bug fix in experiment.perl wrt. get-corpus-script 2012-12-10 23:50:14 +00:00
Barry Haddow
16ea68f55f Fix bug in mml scoring
Line length calculation was out of step with LM scoring.
2012-12-10 15:54:24 +00:00
phikoehn
ed2d191821 allow specification of end point for experiment.perl 2012-12-10 05:56:51 +00:00
phikoehn
ccf9e13d8e bug fix with multicore parallelizer 2012-12-09 22:27:02 +00:00
phikoehn
466b502ae0 minor bug fixes with MML 2012-12-09 20:31:20 +00:00