Commit Graph

1794 Commits

Author SHA1 Message Date
XapaJIaMnu
1bac666e5f Fix small oversights 2014-11-13 15:51:48 +00:00
XapaJIaMnu
617ef015df Extend train_nplm with various options 2014-11-13 15:51:48 +00:00
Nikolay Bogoychev
2b2766cce8 For GPU training one thread is optimal 2014-11-13 15:51:48 +00:00
Abmayne
4af68a0d1a Barry's training scripts with some minor changes by me 2014-11-13 15:51:48 +00:00
Phil Williams
59a1ce7380 substitute-filtered-tables.perl: check for RuleTable feature 2014-11-06 11:14:51 +00:00
Phil Williams
5240c430ce Merge s2t branch
This adds a new string-to-tree decoder, which can be enabled with the -s2t
option.  It's intended to be faster and simpler than the generic chart
decoder, and is designed to support lattice input (still WIP).  For an en-de
system trained on WMT14 data, it's approximately 40% faster in practice.

For background information, see the decoding section of the EMNLP tutorial
on syntax-based MT:

  http://www.emnlp2014.org/tutorials/5_notes.pdf

Some features are not implemented yet, including support for internal tree
structure and soft source-syntactic constraints.
2014-11-04 13:13:56 +00:00
mjdenkowski
40e8f2eca0 Hypergraph output 2014-11-03 09:16:12 -05:00
Hieu Hoang
7ca5e4fbc8 blame stats! 2014-10-31 01:07:33 +00:00
Hieu Hoang
834a89d96b utf8 encoding /Tomas Fulajtar 2014-10-24 07:33:48 -07:00
Rico Sennrich
df74aa3e89 use short names for sparse features to save disk space and I/O when tuning 2014-10-17 10:36:51 +01:00
Hieu Hoang
44ce4b361a reduce lmplz memory consumption in recaser 2014-10-14 17:52:47 +01:00
Hieu Hoang
fe266260fb Merge branch 'master' of github.com:moses-smt/mosesdecoder 2014-10-14 16:01:26 +01:00
Hieu Hoang
6c9c3e1741 portable call to bash /Paul Guyot 2014-10-14 16:01:15 +01:00
Philipp Koehn
2638ff0480 added thot to EMS 2014-10-14 10:13:16 -04:00
Phil Williams
07dbd191ed analysis.perl: update regexp for current trace format 2014-10-13 10:55:07 +01:00
mjdenkowski
a1f561ac31 Only update dynamic models 2014-10-10 15:09:53 -04:00
Philipp Koehn
34cc9461fb More Penn Tree Bank compliance (code by Maria Nadejde and Philip Williams 2014-10-10 16:51:32 +01:00
Philipp Koehn
1741bba750 Penn Tree Bank compliant versions of preprocessing 2014-10-10 16:49:06 +01:00
Rico Sennrich
f63807f957 more robust regex 2014-09-30 15:43:38 +01:00
Rico Sennrich
84ad576750 explicitly set BLEU as default scorer (for return-best-dev)
(evaluator doesn't accept --scconfig without --sctype)
2014-09-24 14:47:58 +01:00
Hieu Hoang
610090c2ed don't run truecase trainer unless it's asked for 2014-09-23 21:50:53 +01:00
Rico Sennrich
59cd4be2c9 don't use optimizer-specific options in extractor/evaluator 2014-09-22 10:49:20 +01:00
Rico Sennrich
d39cbca0b9 (optionally) use n-best file for evaluator/return-best-dev
this adds support for metrics that rely on alignment / trees
2014-09-22 10:49:20 +01:00
Rico Sennrich
3d00e5dc8c basic support for more metrics with kbmira
metrics need getReferenceLength (for background smoothing) to work with kbmira
2014-09-22 10:49:20 +01:00
Philipp Koehn
ab90efe4af allow specification of default weights 2014-09-22 05:28:57 +01:00
Philipp Koehn
e9db2fe4aa Merge branch 'master' of git://github.com/moses-smt/mosesdecoder 2014-09-21 06:04:22 +01:00
Philipp Koehn
3740c9f248 bug fix mmsapt training 2014-09-21 06:02:35 +01:00
Philipp Koehn
a8659d1399 support for specified weights 2014-09-21 06:01:16 +01:00
Philipp Koehn
acefdb0262 bug fix for final-step 2014-09-21 05:59:21 +01:00
Rico Sennrich
0861b464c5 use square brackets with output format '--brackets' (for cleaner escaping and consistency with decoder tree output) 2014-09-15 14:37:52 +01:00
Rico Sennrich
4da0ffc926 use configured scorer (and not always BLEU) for --return-best-dev 2014-09-12 19:17:23 +02:00
Ondrej Bojar
14449b3601 towards a simple line-oriented dump of FSA 2014-09-11 14:50:59 +02:00
Ondrej Bojar
c2fa95dfd8 Merge branch 'master' of https://github.com/moses-smt/mosesdecoder 2014-09-10 14:35:52 +02:00
Ondrej Bojar
01e364d1e6 use --n=0 to check coverage of full sents 2014-09-10 14:33:55 +02:00
Hieu Hoang
eb75e58820 Merge pull request #72 from flammie/master
Add Finnish non-breaking prefixes
2014-09-04 16:31:14 +01:00
Flammie Pirinen
1da3df93bc fix location and remove english notes 2014-09-04 16:01:10 +01:00
Nadir
0d59d0e456 Safe System Function OSM Training 2014-09-01 11:48:27 +01:00
Michael Denkowski
b8c9ae2c55 Update models named "Dynamic..." 2014-08-29 14:40:20 -04:00
Michael Denkowski
9098f3a8b4 Support simulated post-editing with MultiModel 2014-08-19 16:20:35 -04:00
Philipp Koehn
a574454635 bug fix with delete crashed step output files 2014-08-14 14:14:42 -04:00
Philipp Koehn
7a087f24df also delete interrupted steps 2014-08-14 10:15:58 -04:00
Michael Denkowski
300de5d041 Text size limits jobs 2014-08-13 16:51:20 -04:00
Michael Denkowski
057066ea0e Minor fixes for simulated post-editing with mert-moses.pl 2014-08-13 15:58:51 -04:00
Hieu Hoang
94c44c03d5 merge 2014-08-13 18:03:05 +01:00
Matthias Huck
c27cbf55ea source labels: integration into EMS 2014-08-07 21:02:51 +01:00
Hieu Hoang
23f10cc73f move notice about czech prefixes to share/README 2014-08-06 15:03:37 +01:00
Michael Denkowski
9ad59e2d69 Header and some instructions 2014-08-05 15:11:35 -04:00
Michael Denkowski
e7c36ee804 Simulated post-editing merge: XML update, parallel SPE script, MERT 2014-08-05 14:20:00 -04:00
Philipp Koehn
7f41bbba67 changes to protecting specified patterns (with example patterns) 2014-08-03 18:22:27 -04:00
Philipp Koehn
e3b26f334f grrrr... 2014-07-31 21:02:08 -04:00