Commit Graph

223 Commits

Author SHA1 Message Date
Joachim Wagner
2aa5cd2152
fix syntax error in regular expression 2018-06-22 18:16:11 +01:00
Tomas Fulajtar
3a2a63b9dc * Added missing step for the "TRAINING:build-generation-custom".
* Fixed the $cmd parameter - should be "-corpus" instead of "-generation-corpus".
2018-05-18 14:18:11 +02:00
Hieu Hoang
a29f7d5c99 can define srilm-dir in general section 2016-09-27 08:21:18 -04:00
Philipp Koehn
942eb5a8b1 allow configuration of operation sequence model loading, allow specification of KENLM/OSM loading in experiment.perl / train-model.perl 2016-05-29 11:46:42 -04:00
Matthias Huck
1659d6b4c8 Option for target constituent constrained phrase extraction. TargetConstituentAdjacencyFeature. 2016-02-12 17:46:57 +00:00
shuoyangd
1286791ba1 add nnjm-settings to access options in train_nplm.py 2016-02-04 17:18:23 -05:00
Philipp Koehn
b4725e1c91 do not interpret $0 as an EMS settings variable 2016-01-31 11:55:44 -05:00
Matthias Huck
1d3feba8d0 preparing extraction of Hiero soft syntactic preferences (target syntax) 2016-01-09 23:02:31 +00:00
Matthias Huck
bd3f573452 Hiero phrase orientation 2015-12-10 12:56:37 +00:00
Philipp Koehn
94cd1f7433 when building mmsapt phrase table, also use mmsapt reordering table 2015-11-23 18:12:56 -05:00
Barry Haddow
90f15cc619 extra nplm settings 2015-09-04 10:07:50 +01:00
Barry Haddow
f808b32030 support version of nplm that picks best on heldout 2015-08-03 16:47:25 +01:00
Philipp Koehn
836ca8212a better support of grid engine cluster 2015-07-29 11:03:24 -04:00
Barry Haddow
e53ad40859 Support for nplm in ems 2015-07-23 10:37:26 +01:00
Philipp Koehn
496f8c6d85 only extract reordering phrase pairs if using mmsapt phrase table 2015-07-20 11:44:22 -04:00
Rico Sennrich
1b1bafb1e8 ems: add option to factorize after truecase/split/etc. 2015-07-20 10:43:23 +01:00
Philipp Koehn
66ecf98cf7 minor bug fix 2015-07-14 11:01:22 -04:00
Rico Sennrich
ca72105fdf fix ems regression 2015-07-14 13:16:25 +01:00
Philipp Koehn
7e3050f7f2 allow saving of model from fast-align (for incremental use) 2015-07-14 05:27:03 -04:00
Barry Haddow
3fdbb00904 Improvements to handling of bilingual LM in EMS 2015-07-10 15:44:24 +01:00
Jeroen Vermeulen
ef028446f3 Add license notices to scripts.
This is not pleasant to read (and much, much less pleasant to write!) but
sort of necessary in an open project.  Right now it's quite hard to figure
out what is licensed how, which doesn't matter much to most people but can
suddenly become very important when people want to know what they're being
allowed to do.

I kept the notices as short as I could.  As far as I could see, everything
without a clear license notice is LGPL v2.1 or later.
2015-05-29 18:30:26 +07:00
Rico Sennrich
6aac7ded9a EMS: more flexible way to concatenate LM training data.
the implementation allows the user to specify which corpora to combine,
and to have multiple LMs on the same data.
2015-05-20 17:20:02 +01:00
Rico Sennrich
8ca6764c7d ems: allow LMs with user-specified training commands and moses.ini config entries
intended for neural LMs, syntactic LMs, and the like. currently doesn't play nice with INTERPOLATED-LM.
2015-05-18 19:07:37 +01:00
Rico Sennrich
fb06a2325e fix broken ems with interpolated lm disabled 2015-05-18 17:26:09 +01:00
Rico Sennrich
27fd45d088 ems: training LM on concatenation of all LM training corpora 2015-05-18 12:18:49 +01:00
Jeroen Vermeulen
a25193cc5d Fix a lot of lint, mostly trailing whitespace.
This is lint reported by the new lint-checking functionality in beautify.py.
(We can change to a different lint checker if we have a better one, but it
would probably still flag these same problems.)

Lint checking can help a lot, but only if we get the lint under control.
2015-05-17 20:04:04 +07:00
Barry Haddow
85c1af4d72 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2015-05-08 09:16:55 +01:00
Barry Haddow
f403f5e478 mmsapt doesn't require feature weights on first tuning iteration 2015-05-08 09:16:51 +01:00
Philipp Koehn
b369699661 various small changes, mostly related to better compliance with grid engine 2015-05-01 17:44:18 -04:00
Hieu Hoang
6162223690 add use warnings to all perl scripts 2015-04-13 20:42:33 +04:00
Hieu Hoang
b2f9ba2b64 revert last commit to add MASTER_PATH. Not needed 2015-04-02 19:29:42 +04:00
Hieu Hoang
27b36e0c96 pass in PATH variable from master node. When you're running on a grid but really just qsubbing everything to 1 slave node 2015-04-02 19:15:21 +04:00
Hieu Hoang
2d1da3219d consistently use 'env perl' command for environments where the 1st perl in PATH isn't the default perl. Which is kinda stupid 2015-04-02 17:38:56 +04:00
Phil Williams
0a8e5fb3bf EMS: fix TRAINING:use-syntax-input-weight-feature option 2015-03-13 17:18:56 +00:00
Philipp Koehn
1632c5f39d proper handling of specified configuration file 2015-03-11 16:49:20 +00:00
Matthias Huck
01bed83cf9 GHKM extraction: option to strip non-terminal labels from BitPar syntactic parses right during extraction (i.e., remove any suffix starting with a hyphen from the label) 2015-03-10 21:25:32 +00:00
Phil Williams
9e2eb702dc EMS: add TRAINING:use-syntax-input-weight-feature option 2015-03-10 11:40:49 +00:00
Phil Williams
7eba58b942 EMS: add TRAINING:dont-tune-glue-grammar option
Adds -dont-tune-glue-grammar to train-model.perl command during config file
generation step.  This is preferable to manually adding -dont-tune-glue-grammar
to TRAINING:training-options because changing its value won't trigger a re-run
of dependent steps that don't really need re-running (like word alignment).
2015-03-10 10:20:19 +00:00
Matthias Huck
25f5470216 GHKM: write target parts-of-speech as a factor 2015-03-09 21:54:03 +00:00
Matthias Huck
06e87d851e GHKM: extract POS phrase property (from preterminals in the syntactic parse tree) 2015-03-04 21:40:56 +00:00
Phil Williams
90e8d4940c EMS: add TRAINING:no-glue-grammar option 2015-03-03 12:36:09 +00:00
Philipp Koehn
2638ff0480 added thot to EMS 2014-10-14 10:13:16 -04:00
Philipp Koehn
acefdb0262 bug fix for final-step 2014-09-21 05:59:21 +01:00
Philipp Koehn
a574454635 bug fix with delete crashed step output files 2014-08-14 14:14:42 -04:00
Philipp Koehn
7a087f24df also delete interrupted steps 2014-08-14 10:15:58 -04:00
Matthias Huck
c27cbf55ea source labels: integration into EMS 2014-08-07 21:02:51 +01:00
Matthias Huck
3a5dee12e8 implementation of phrase orientation in GHKM extraction
(...but a corresponding feature function for the chart-based decoder has not been written yet)
2014-07-28 18:27:12 +01:00
phikoehn
2d11fe3916 Merge branch 'master' of ssh://github.com/moses-smt/mosesdecoder 2014-07-23 15:40:04 +01:00
phikoehn
2239501b21 allow specification of weights for lm interpolation 2014-07-23 15:39:42 +01:00
Philipp Koehn
55ae15a6f8 integration of Uli Germann's memory mapped suffix array phrase table into EMS 2014-07-22 10:12:14 -04:00