533 Commits

Author SHA1 Message Date
Hieu Hoang
db1894ad24 consistent output 2018-12-30 12:05:57 +00:00
Hieu Hoang
413ba6b583 increase cores to 16. For bitextor azure pipeline 2018-12-10 16:17:16 +00:00
Hieu Hoang
c753350641 ems config for moses2 2018-12-08 19:47:10 +00:00
Hieu Hoang
3d4bf99367 sacre bleu 2018-12-04 15:40:00 +00:00
Hieu Hoang
dbbc47292f sacre bleu 2018-12-04 15:27:09 +00:00
Hieu Hoang
345dabcde6 use --discount_fallback 2018-12-04 14:34:47 +00:00
Barry Haddow
d2b558728f basic support for Gujarati and Hindi, backported from one of the many upstreams 2018-10-30 14:16:16 +00:00
Rico Sennrich
411f45f249 multi-bleu-detok should take raw reference 2018-09-26 12:24:07 +01:00
Joachim Wagner
2aa5cd2152 fix syntax error in regular expression 2018-06-22 18:16:11 +01:00
Tomas Fulajtar
3a2a63b9dc * Added missing step for the "TRAINING:build-generation-custom".
* Fixed the $cmd parameter - should be "-corpus" instead of "-generation-corpus".
2018-05-18 14:18:11 +02:00
Hieu Hoang
999e83d128 Merge pull request #196 from astronautguo/master
fix bug when copying to cache
2018-05-04 14:42:35 +01:00
Kenneth Heafield
ae47469919 Don't drop last character if file does not end with newline 2018-05-03 10:28:11 +01:00
astro
f47e670f20 fix bug when copying to cache 2018-04-27 19:52:20 -04:00
Rico Sennrich
b99af32113 fix split-input if it is passed, but if output-splitter is defined 2017-04-24 12:16:36 +01:00
Phil Williams
a5c99ca660 reference-from-sgm.perl: fix Perl error 2017-03-07 15:54:08 +00:00
Linas Vepstas
8fdd19310b Update to apply CJK processing conditionally. 2017-01-11 11:23:54 -06:00
Linas Vepstas
2e48f83ab4 Handle punctuation+CJK combinations. 2017-01-08 10:08:53 -06:00
Linas Vepstas
6fb2c97029 Bug-fix: regular Western sentence enders not recognized. 2017-01-05 23:29:00 -06:00
Linas Vepstas
1933bcbf33 Whoops, revert cut-n-paste damage in previous commit. 2017-01-05 11:39:01 -06:00
Linas Vepstas
144f43495e Preliminary support for Chinese.
Also, cleanup some of the comments.
2017-01-05 11:33:10 -06:00
Linas Vepstas
9f5500a3a8 oops. 2017-01-05 10:09:34 -06:00
Linas Vepstas
ab6816f9a7 Purely cosmetic cleanup.
Use same indentation style throughout; wrap long lines; capitalize
sentences; add punctuation; remove trailing whitespace.
2017-01-05 10:08:06 -06:00
Hieu Hoang
ff12a13eaa re-tune if decoder changed. eg moses -> moses2 2017-01-02 16:37:56 -05:00
Hieu Hoang
3d5500e698 Merge branch 'perf_moses2' of github.com:hieuhoang/mosesdecoder into perf_moses2 2016-09-27 08:21:34 -04:00
Hieu Hoang
a29f7d5c99 can define srilm-dir in general section 2016-09-27 08:21:18 -04:00
Hieu Hoang
9527fb050d duplicate -T arg for OSM 2016-09-26 12:04:33 +01:00
Hieu Hoang
9236eeeba9 add brodie to list of machines 2016-08-13 18:27:25 +01:00
Hieu Hoang
a8325a3e8e make probing pt work with ems 2016-08-11 15:46:43 +01:00
Philipp Koehn
ef9d327841 only train input or output truecaser, if only one is needed 2016-07-10 11:17:21 -04:00
Philipp Koehn
defbf8d7c3 barebone support for quality estimation in experiment.perl 2016-06-04 05:15:34 -04:00
Philipp Koehn
942eb5a8b1 allow configuration of operation sequence model loading, allow specification of KENLM/OSM loading in experiment.perl / train-model.perl 2016-05-29 11:46:42 -04:00
Philipp Koehn
c07e6faed8 farasa_moses.sh not default tokenizer 2016-05-25 03:39:00 -04:00
Nadir
bb7263f0f9 Arabic Tokenizer 2016-05-04 16:20:55 +01:00
bicici
482e8f744f Update config.basic 2016-05-01 12:00:01 +03:00
bicici
6edf6d2df2 Update config.basic 2016-05-01 11:47:44 +03:00
Matthias Huck
1659d6b4c8 Option for target constituent constrained phrase extraction. TargetConstituentAdjacencyFeature. 2016-02-12 17:46:57 +00:00
shuoyangd
7cf3b23962 fix search graph parsing for CYK+ 2016-02-04 17:21:24 -05:00
shuoyangd
1286791ba1 add nnjm-settings to access options in train_nplm.py 2016-02-04 17:18:23 -05:00
Philipp Koehn
9132ab1ded bug fix of bug fix of generation table training 2016-02-01 14:01:06 -05:00
Philipp Koehn
904afd175d prune-generation step invalidated original build-generation step 2016-01-31 11:56:37 -05:00
Philipp Koehn
b4725e1c91 do not interpret $0 as a EMS settings variable 2016-01-31 11:55:44 -05:00
Matthias Huck
1d3feba8d0 preparing extraction of Hiero soft syntactic preferences (target syntax) 2016-01-09 23:02:31 +00:00
Barry Haddow
977e8eaf67 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2016-01-06 11:55:16 +00:00
Barry Haddow
7125096c29 enable nplm training on separate host, fix ems for nplm 2016-01-06 11:55:12 +00:00
Matthias Huck
bd3f573452 Hiero phrase orientation 2015-12-10 12:56:37 +00:00
Philipp Koehn
33f4e93915 no binarizing/filtering with mmsapt 2015-12-01 23:10:37 +00:00
Philipp Koehn
94cd1f7433 when building mmsapt phrase table, also use mmsapt reordering table 2015-11-23 18:12:56 -05:00
Barry Haddow
ccfe8ba018 remove unused method, and misleading comment 2015-11-10 21:35:08 +00:00
Matthias Huck
62748f5296 Revert "EMS: fix filtering issue when output-splitter is defined"
This reverts commit d5c41634e8.
2015-10-12 18:05:46 +01:00
Nadir
15b4aa91b0 Extra Space 2015-10-08 13:16:58 +01:00