Commit Graph

465 Commits

Author SHA1 Message Date
Barry Haddow
3a2116b2c9 add quotes so arguments don't get lost 2015-07-29 09:35:19 +01:00
Phil Williams
2cda286a06 experiment.meta: re-run fast_align symmetrization if symmetrization type changes 2015-07-28 16:55:55 +01:00
Rico Sennrich
a968536176 ems fix: pass-unless doesn't understand AND 2015-07-28 16:37:50 +01:00
Barry Haddow
e53ad40859 Support for nplm in ems 2015-07-23 10:37:26 +01:00
Philipp Koehn
496f8c6d85 only extract reordering phrase pairs if use mmsapt phrase table 2015-07-20 11:44:22 -04:00
Philipp Koehn
fcf2934a2f customized phrase table pruning step 2015-07-20 11:43:02 -04:00
Rico Sennrich
1b1bafb1e8 ems: add option to factorize after truecase/split/etc. 2015-07-20 10:43:23 +01:00
Philipp Koehn
66ecf98cf7 minor bug fix 2015-07-14 11:01:22 -04:00
Rico Sennrich
ca72105fdf fix ems regression 2015-07-14 13:16:25 +01:00
Philipp Koehn
7e3050f7f2 allow saving of model from fast-align (for incremental use) 2015-07-14 05:27:03 -04:00
Barry Haddow
3fdbb00904 Improvements to handling of bilingual LM in EMS 2015-07-10 15:44:24 +01:00
Hieu Hoang
f66beabf4f Generation error in EMS due to pruning. Let's see if this works. 2015-06-28 14:03:54 +04:00
Hieu Hoang
b83803203e prune generation table in ems 2015-06-25 18:10:31 +04:00
Hieu Hoang
dce0f33270 prune generation table in ems 2015-06-24 18:35:59 +04:00
Barry Haddow
ad8114ddb0 capitalisation 2015-06-15 16:23:12 +01:00
XapaJIaMnu
166bf7365f Forgot to update the weight config path 2015-06-12 16:56:36 +01:00
XapaJIaMnu
ffd3f2bb6e Added basic BilingualNPLM support to EMS and an example config. 2015-06-12 16:21:24 +01:00
Jeroen Vermeulen
85c23ed7dc Fix some JS lint. 2015-06-02 18:05:12 +07:00
Jeroen Vermeulen
0981d23705 Lint-fixing binge. 2015-06-02 16:02:39 +07:00
Jeroen Vermeulen
ef028446f3 Add license notices to scripts.
This is not pleasant to read (and much, much less pleasant to write!) but
sort of necessary in an open project.  Right now it's quite hard to figure
out what is licensed how, which doesn't matter much to most people but can
suddenly become very important when people want to know what they're being
allowed to do.

I kept the notices as short as I could.  As far as I could see, everything
without a clear license notice is LGPL v2.1 or later.
2015-05-29 18:30:26 +07:00
Rico Sennrich
f6f56d11af ems: parse-relax comes last in train; do same for dev/test 2015-05-25 15:52:07 +01:00
Rico Sennrich
98ff2382d0 duplication of existing functionality 2015-05-20 17:35:38 +01:00
Rico Sennrich
6aac7ded9a EMS: more flexible way to concatenate LM training data.
the implementation allows the user to specify which corpora to combine,
and to have multiple LMs on the same data.
2015-05-20 17:20:02 +01:00
Rico Sennrich
8ca6764c7d ems: allow LMs with user-specified training commands and moses.ini config entries
intended for neural LMs, syntactic LMs, and the like. currently doesn't play nice with INTERPOLATED-LM.
2015-05-18 19:07:37 +01:00
Rico Sennrich
fb06a2325e fix broken ems with interpolated lm disabled 2015-05-18 17:26:09 +01:00
Rico Sennrich
f85dd85f6b ignore-unless magic 2015-05-18 16:17:33 +01:00
Rico Sennrich
59376f500b still confused about pass-unless vs. ignore-unless 2015-05-18 14:40:56 +01:00
Rico Sennrich
45a97f9016 EMS: disable concatenated LM by default 2015-05-18 14:10:29 +01:00
Rico Sennrich
27fd45d088 ems: training LM on concatenation of all LM training corpora 2015-05-18 12:18:49 +01:00
Jeroen Vermeulen
e2a632a2b8 JavaScript lint. 2015-05-17 21:36:07 +07:00
Jeroen Vermeulen
5d0bbb6a45 Fix some JavaScript lint. Still a lot left. 2015-05-17 21:24:04 +07:00
Jeroen Vermeulen
a25193cc5d Fix a lot of lint, mostly trailing whitespace.
This is lint reported by the new lint-checking functionality in beautify.py.
(We can change to a different lint checker if we have a better one, but it
would probably still flag these same problems.)

Lint checking can help a lot, but only if we get the lint under control.
2015-05-17 20:04:04 +07:00
Jeroen Vermeulen
61162dd242 Fix more Python lint.
Most of the complaints fixed here were from Pocketlint, but many were also
from Syntastic, the Vim plugin.
2015-05-16 17:26:56 +07:00
Hieu Hoang
abfc0671a3 osm tweaks and morfessor wrapper 2015-05-12 20:19:39 +04:00
Hieu Hoang
8bb18b9ff0 add no-splitter-training argument. Splitter to be used by mada 2015-05-11 15:26:50 +04:00
Barry Haddow
85c1af4d72 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2015-05-08 09:16:55 +01:00
Barry Haddow
f403f5e478 mmsapt doesn't require feature weights on first tuning iteration 2015-05-08 09:16:51 +01:00
Hieu Hoang
2acb590394 output bleu for multi-bleu hack 2015-05-05 17:54:35 +04:00
Hieu Hoang
d006c6ef8c don't output remaining args twice 2015-05-05 12:15:08 +04:00
Hieu Hoang
8f272e04a9 output debugging messages to stderr, not stdout 2015-05-05 12:01:21 +04:00
Hieu Hoang
d456d9229e add multi-bleu-detok. Like multi-bleu scoring but will detokenize/post-process before scoring 2015-05-03 14:07:12 +04:00
Philipp Koehn
a4a7c14593 allow breaking up training data for fast align (to avoid memory blowups for very large corpora) 2015-05-01 17:47:08 -04:00
Philipp Koehn
b369699661 various small changes, mostly related to better compliance with grid engine 2015-05-01 17:44:18 -04:00
Rico Sennrich
e98a2fc980 fix interpolation for LM with parser in pre-processing 2015-04-30 15:46:33 +01:00
Hieu Hoang
4b47e1148c use ignore-unless /Philipp Koehn 2015-04-22 23:02:57 +04:00
Hieu Hoang
40933b4a78 hack to allow target side of tokenized parallel corpus to be used for LM 2015-04-22 19:01:12 +04:00
Hieu Hoang
ab01d30687 make sure GetOptions doesn't consume -T by confusing it with --text 2015-04-21 17:53:46 +04:00
Rico Sennrich
15d3c3f259 be more tolerant about xml input 2015-04-21 14:04:25 +01:00
Rico Sennrich
5a3d5b6bdd EMS: LM:mock-parse can be actual parser 2015-04-21 10:21:24 +01:00
Hieu Hoang
1b9dc6cfae more butinah tweaks 2015-04-19 11:50:50 +04:00