Commit Graph

1867 Commits

Author SHA1 Message Date
Matthias Huck
8025cbf350 Merge branch 'master' of https://github.com/moses-smt/mosesdecoder 2015-02-16 15:10:15 +00:00
Barry Haddow
34b139e2ae Remove debug 2015-02-13 12:14:18 +00:00
Phil Williams
92a21f9d3a train-model.perl: fix "argument isn't numeric" warning 2015-02-13 11:55:39 +00:00
Phil Williams
7e54e23fe2 Update transliteration scripts to use the on-disk phrase table
The scripts now use CreateOnDiskPt instead of processPhraseTable (which
is no longer supported and was removed by commit f3a84fc01).
2015-02-13 11:36:16 +00:00
Kenneth Heafield
ee39fdbaa5 Relative path 2015-02-10 10:43:10 -05:00
Charley C
e40606d08f default path update in train-recaser 2015-02-09 18:36:31 -05:00
Matthias Huck
53ce063214 tuneable-components config parameter for feature functions 2015-02-09 13:52:05 +00:00
Philipp Koehn
f69c1dab02 more efficient default recaser training 2015-02-04 09:18:09 +00:00
Hieu Hoang
78f79632b9 script to convert moses.ini v2 to v1 /Tom Hoar 2015-02-03 10:59:38 +00:00
Kenneth Heafield
925565a0b9 "just put it in. I'll verify it if i can be bovvered" --Hieu /usr/bin/env 2015-01-29 18:37:05 -05:00
Matthias Huck
449d9b294b Revert "env perl shebang"
This reverts commit 34f2801f8a.

Caused problems because /bin/env doesn't exist on Ubuntu 12.04.
/usr/bin/env does, though.
2015-01-29 21:15:20 +00:00
Kenneth Heafield
34f2801f8a env perl shebang 2015-01-27 18:35:54 -05:00
XapaJIaMnu
6ca1a4718c Expose learning rate as a parameter 2015-01-25 02:13:47 +00:00
Matthias Huck
9987beb453 SoftSourceSyntacticConstraintsFeature: Now for both non-terminals (as before) _and_ terminals.
Also added score components based on relative frequency.
(TODO: logprobs right now; are plain probabilities better?)
2015-01-23 18:41:18 +00:00
Hieu Hoang
59c4baec3f use utf8 german model 2015-01-22 16:10:12 +00:00
Kenneth Heafield
7c507bfa74 May is not an abbreviation 2015-01-19 16:37:57 -05:00
Hieu Hoang
30e31d4a95 don't normalise quotes if tokenizing like Penn /Phil Williams 2015-01-16 12:34:22 +00:00
Hieu Hoang
19d7c44aad move normalisation of quotes into normalize-punctuation.perl /Tom Hoar 2015-01-16 11:37:31 +00:00
Hieu Hoang
6d61db28fa use astyle 2.01. It's on Edinburgh server and doesn't screw up enum 2015-01-14 19:21:11 +00:00
Hieu Hoang
90d4b2d713 use pigz rather than gzip if it exists 2015-01-13 15:16:22 +00:00
Hieu Hoang
6186262a3b don't use processPhraseTable in EMS 2015-01-12 12:43:51 +00:00
Hieu Hoang
a8d4b81e71 Revert "Update train-model.perl"
This reverts commit e1e14a91ee.
2015-01-08 16:07:40 +00:00
Hieu Hoang
5336598734 beautify 2015-01-08 08:29:56 +00:00
Philipp Koehn
0441fd6ab9 added informative error message when trying to build a lexicalized reordering model with hierarchical model 2015-01-06 18:46:02 +00:00
Hieu Hoang
0a707597d8 Revert "Added error message on experiment.meta for the filter step 'No phrases in'"
This reverts commit 2105423626.
2015-01-03 21:58:15 +05:30
Eleftherios Avramidis
2105423626 Added error message on experiment.meta for the filter step 'No phrases in' 2014-12-28 18:09:33 +01:00
Philipp Koehn
59fdb3d99c same spec for dedicated script as for train-model.perl and filter-model-given-input.pl 2014-12-21 01:37:05 +00:00
Philipp Koehn
831f947874 long overdue feature: do not produce very low scoring translation table entries that are never used and just gum up the works 2014-12-21 01:14:42 +00:00
Rico Sennrich
67e101b07a Revert "Update train-model.perl"
This reverts commit 41f06a01c0.
2014-12-17 17:51:02 +00:00
Rico Sennrich
685f18ca1b documentation/readability 2014-12-16 17:42:17 +00:00
Nicola Bertoldi
d0cddf0f2d Merge branch 'master' of https://github.com/moses-smt/mosesdecoder 2014-12-16 17:35:47 +01:00
Nicola Bertoldi
4e77665d30 better handling of cache-based models with inconsistent parameters 2014-12-15 17:42:41 +01:00
Xiang Li
41f06a01c0 Update train-model.perl
If the final alignment model is model 3-5, the hmm model will be trained.
2014-12-16 00:37:15 +08:00
Nicola Bertoldi
e4eb201c52 merged master into dynamic-models and solved conflicts 2014-12-13 12:52:47 +01:00
Hieu Hoang
5ae5a630a6 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2014-12-12 10:04:58 +00:00
Kenneth Heafield
8bbccd441a Fix #85 by changing the default LM. Hieu said it's ok in the issue. 2014-12-11 23:51:48 -05:00
Hieu Hoang
c48a3aadc1 chmod 2014-12-11 16:54:19 +00:00
Hieu Hoang
765d8d1350 Merge pull request #83 from lixiangnlp/patch-1
Update train-model.perl
2014-12-10 15:48:35 +00:00
Phil Williams
1353aa57dc experiment.meta: fixes for $input-parse-relaxer 2014-12-08 16:26:08 +00:00
Phil Williams
60e56efc6b phrase-extract: add syntax-common sub-library
And remove some (near-)duplicate code from pcfg-common and score-stsg.
2014-12-07 14:27:51 +00:00
Kenneth Heafield
f97ed79a70 Month abbreviations shouldn't be causing a sentence split.
Yes this will break existing tokenized data :-(.
2014-12-05 03:41:01 -05:00
Philipp Koehn
9d55ce13c0 change for thot integration 2014-12-02 14:05:56 -05:00
Xiang Li
e1e14a91ee Update train-model.perl
The default number of HMM iterations in GIZA++ is 5, so even when the "hmm-align" option is not set, HMM alignment is still activated by the training script.
2014-12-01 11:26:53 +08:00
Rico Sennrich
4ca730a67c improve bilingualLM alignment heuristics consistency 2014-11-26 10:32:41 +00:00
Rico Sennrich
ee759bfede move bilingual-lm training scripts 2014-11-26 10:32:37 +00:00
Tomáš Musil
4cb81e3093 lmtype now preferred as symbolic name 2014-11-24 12:20:36 +01:00
Hieu Hoang
c0be182bfa makemteval and small change to tokenizer. /Tom Hoar and Tomas Fulajtar 2014-11-21 13:55:13 +00:00
XapaJIaMnu
52c520c042 Resolve merge conflicts 2014-11-20 15:50:32 +00:00
Hieu Hoang
e27f6b0120 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2014-11-15 14:32:49 +00:00
Hieu Hoang
67ad197d5a take out PYTHONIOENCODING=utf-8. Rely on Rico's python changes 2014-11-15 14:32:31 +00:00