Commit Graph

12932 Commits

Author SHA1 Message Date
Rico Sennrich
6748f4106f compatibility with NPLM 0.3 (breaks earlier versions of NPLM) 2014-11-17 18:08:42 +00:00
XapaJIaMnu
a343837095 Add option to choose activation function during nplm training 2014-11-15 11:54:47 +00:00
Paul Baltescu
35401f7dd3 Merge branch 'bilingual-lm' of github.com:moses-smt/mosesdecoder into bilingual-lm 2014-11-14 12:39:02 +00:00
Paul Baltescu
fefd2b0ada Fix merge errors. 2014-11-13 16:52:24 +00:00
XapaJIaMnu
2028b86642 BilingualNPLM requires specific source and target vocabulary lists 2014-11-13 16:14:17 +00:00
XapaJIaMnu
d5567b6cfb Training: Do the preparation step ourselves. No validation support yet. No decoder support yet. 2014-11-13 16:14:17 +00:00
Rico Sennrich
9b161dc888 floor score of bilingualNPLM 2014-11-13 16:14:16 +00:00
XapaJIaMnu
b254fcea99 moses/LM/BilingualLM.cpp 2014-11-13 16:14:16 +00:00
Rico Sennrich
8fd3be9e4e add EOS token </s> to each sentence 2014-11-13 16:14:16 +00:00
Rico Sennrich
f26fc251d5 sort vocab by frequency 2014-11-13 16:14:16 +00:00
XapaJIaMnu
bb70f60f67 grrr 2014-11-13 16:14:16 +00:00
XapaJIaMnu
e330ab35d5 Short option must be only one letter 2014-11-13 16:14:16 +00:00
XapaJIaMnu
a74105ea7d Fix a wrong condition 2014-11-13 16:14:16 +00:00
XapaJIaMnu
e54c171850 Make it optional to prepare the validation set 2014-11-13 16:14:16 +00:00
XapaJIaMnu
a300824bd1 Add optional validation during training 2014-11-13 16:14:16 +00:00
XapaJIaMnu
f6c64adb92 Remove unnecessary const gimmicks 2014-11-13 16:14:16 +00:00
XapaJIaMnu
18a6d12cb0 Rework lookup and greatly speedup decoding (2x+) 2014-11-13 16:14:16 +00:00
XapaJIaMnu
cf3fe60cf6 Backoff to null token in hiero 2014-11-13 16:14:16 +00:00
XapaJIaMnu
06fa9b5916 Add option to build with Og for newer versions of gcc 2014-11-13 16:14:16 +00:00
XapaJIaMnu
493f05eb97 Remove old unused code from BilingualLM 2014-11-13 16:14:16 +00:00
Paul Baltescu
211878c9d3 Make OxLM extensions compile. 2014-11-13 16:14:16 +00:00
XapaJIaMnu
8e4ff790a7 Add null word support for hiero. 2014-11-13 16:14:16 +00:00
Paul Baltescu
0578a31799 Flag for setting OxLM models to unnormalized. 2014-11-13 16:14:16 +00:00
XapaJIaMnu
0451142ece Add null token normalization for models to be used with the chart decoder. 2014-11-13 16:13:38 +00:00
XapaJIaMnu
aae894fe6b Add null token in vocabulary during construction 2014-11-13 16:13:38 +00:00
XapaJIaMnu
b4f51c05d1 Add option to reduce the ngrams from already prepared .ngrams file to train a model with smaller number of ngrams 2014-11-13 16:13:38 +00:00
XapaJIaMnu
5851a2c2bb Prevent reallocation of vectors 2014-11-13 16:13:38 +00:00
Paul Baltescu
2705a47876 Fix OxLM. 2014-11-13 16:13:38 +00:00
XapaJIaMnu
02c375ef78 Refactor the BilingualLM for chart to make it faster. Untested 2014-11-13 16:13:04 +00:00
XapaJIaMnu
7858f74e9e Rename BilingualLM_NPLM so that it is not confused with a sparse feature 2014-11-13 16:13:04 +00:00
Paul Baltescu
61826cee8a Rename OxLM features. 2014-11-13 16:13:04 +00:00
Paul Baltescu
32c169c25f Optional back-off to POS tags in OxLM. 2014-11-13 16:12:19 +00:00
Paul Baltescu
86d64b65e2 Correctly map the source unknown token. 2014-11-13 16:10:40 +00:00
Paul Baltescu
167e272818 Fix Bilingual OxLM context word order. 2014-11-13 16:10:40 +00:00
Paul Baltescu
5f9d481ee6 Make query cache sentence specific. 2014-11-13 16:10:40 +00:00
Paul Baltescu
248aa4bf8a Convert moses words to oxlm word ids. 2014-11-13 16:10:40 +00:00
Paul Baltescu
97b632e045 Clean up OxLMMapper. 2014-11-13 16:08:56 +00:00
Paul Baltescu
90ebf13789 Set BilingualLM parameters nicely. 2014-11-13 16:08:10 +00:00
Paul Baltescu
7588c4b8e3 Skeleton for source conditioned OxLM feature. 2014-11-13 16:08:10 +00:00
Paul Baltescu
6f9d59129f Rename LBLLM -> OxLM. 2014-11-13 16:07:38 +00:00
Paul Baltescu
af28063e3b Fix compilation errors introduced by new oxlm changes. 2014-11-13 15:54:54 +00:00
Paul Baltescu
4811701277 Fix broken include. 2014-11-13 15:51:48 +00:00
Paul Baltescu
cb7167f088 Fix bugs in BilingualLM for chart based decoding. 2014-11-13 15:51:48 +00:00
Paul Baltescu
3624bd776c Fix a few bugs in BilingualLM for phrase based decoding. 2014-11-13 15:51:48 +00:00
Paul Baltescu
5f87cf94d8 Move BilingualLM under LM. 2014-11-13 15:51:48 +00:00
XapaJIaMnu
fbac0ae418 Make sure we always have unk in the vocabulary, otherwise we get off-by-one indexes during decoding 2014-11-13 15:51:48 +00:00
XapaJIaMnu
961578286f Forgot to close a file... 2014-11-13 15:51:48 +00:00
XapaJIaMnu
1bac666e5f Fix small oversights 2014-11-13 15:51:48 +00:00
XapaJIaMnu
617ef015df Extend train_nplm with various options 2014-11-13 15:51:48 +00:00
XapaJIaMnu
a1a10a9209 Remove unused variable that likely causes crashes 2014-11-13 15:51:48 +00:00