Hieu Hoang
e27f6b0120
Merge branch 'master' of github.com:moses-smt/mosesdecoder
2014-11-15 14:32:49 +00:00
Hieu Hoang
67ad197d5a
take out PYTHONIOENCODING=utf-8. Rely on Rico's python changes
2014-11-15 14:32:31 +00:00
Rico Sennrich
b0b5eef0c6
fix metric interpolation with mert
2014-11-14 14:35:32 +00:00
Hieu Hoang
acd3ac964a
set PYTHONIOENCODING=utf-8 before running merge_alignment.py
2014-11-14 14:34:31 +00:00
Phil Williams
5240c430ce
Merge s2t branch
This adds a new string-to-tree decoder, which can be enabled with the -s2t
option. It's intended to be faster and simpler than the generic chart
decoder, and is designed to support lattice input (still WIP). For an en-de
system trained on WMT14 data, it's approximately 40% faster in practice.
For background information, see the decoding section of the EMNLP tutorial
on syntax-based MT:
http://www.emnlp2014.org/tutorials/5_notes.pdf
Some features are not implemented yet, including support for internal tree
structure and soft source-syntactic constraints.
2014-11-04 13:13:56 +00:00
Hieu Hoang
834a89d96b
utf8 encoding /Tomas Fulajtar
2014-10-24 07:33:48 -07:00
Hieu Hoang
6c9c3e1741
portable call to bash /Paul Guyot
2014-10-14 16:01:15 +01:00
Philipp Koehn
34cc9461fb
More Penn Tree Bank compliance (code by Maria Nadejde and Philip Williams)
2014-10-10 16:51:32 +01:00
Rico Sennrich
f63807f957
more robust regex
2014-09-30 15:43:38 +01:00
Rico Sennrich
84ad576750
explicitly set BLEU as default scorer (for return-best-dev)
(evaluator doesn't accept --scconfig without --sctype)
2014-09-24 14:47:58 +01:00
Rico Sennrich
59cd4be2c9
don't use optimizer-specific options in extractor/evaluator
2014-09-22 10:49:20 +01:00
Rico Sennrich
d39cbca0b9
(optionally) use n-best file for evaluator/return-best-dev
this adds support for metrics that rely on alignment / trees
2014-09-22 10:49:20 +01:00
Rico Sennrich
3d00e5dc8c
basic support for more metrics with kbmira
metrics need getReferenceLength (for background smoothing) to work with kbmira
2014-09-22 10:49:20 +01:00
Philipp Koehn
ab90efe4af
allow specification of default weights
2014-09-22 05:28:57 +01:00
Philipp Koehn
e9db2fe4aa
Merge branch 'master' of git://github.com/moses-smt/mosesdecoder
2014-09-21 06:04:22 +01:00
Philipp Koehn
3740c9f248
bug fix mmsapt training
2014-09-21 06:02:35 +01:00
Rico Sennrich
0861b464c5
use square brackets with output format '--brackets' (for cleaner escaping and consistency with decoder tree output)
2014-09-15 14:37:52 +01:00
Rico Sennrich
4da0ffc926
use configured scorer (and not always BLEU) for --return-best-dev
2014-09-12 19:17:23 +02:00
Michael Denkowski
057066ea0e
Minor fixes for simulated post-editing with mert-moses.pl
2014-08-13 15:58:51 -04:00
Hieu Hoang
94c44c03d5
merge
2014-08-13 18:03:05 +01:00
Matthias Huck
c27cbf55ea
source labels: integration into EMS
2014-08-07 21:02:51 +01:00
Michael Denkowski
e7c36ee804
Simulated post-editing merge: XML update, parallel SPE script, MERT
2014-08-05 14:20:00 -04:00
Matthias Huck
3a5dee12e8
implementation of phrase orientation in GHKM extraction
(...but a corresponding feature function for the chart-based decoder has not been written yet)
2014-07-28 18:27:12 +01:00
Barry Haddow
bfb5cca518
Merge branch 'master' of github.com:moses-smt/mosesdecoder
Conflicts:
util/read_compressed.cc
2014-07-23 09:40:55 +01:00
Philipp Koehn
55ae15a6f8
integration of Uli Germann's memory mapped suffix array phrase table into EMS
2014-07-22 10:12:14 -04:00
Barry Haddow
2a611194a2
reinstate new kbmira args
2014-07-21 11:43:37 +01:00
Hieu Hoang
a3bd695cd4
factor for OOV is 0, not <unk>, which interferes with source input. Add an extra argument to control whether input words are lowercased
2014-07-13 02:54:58 +01:00
Hieu Hoang
da84cce8c4
Merge github.com:moses-smt/mosesdecoder into hieu
2014-06-09 16:20:29 +01:00
Rico Sennrich
169c3fce38
convert CoNLL-X to Moses XML format
2014-06-09 15:24:41 +01:00
Hieu Hoang
091ce3f016
Merge ../mosesdecoder into hieu
2014-06-06 17:25:26 +01:00
Philipp Koehn
fc8e588f25
kbmira bug fix & factor handling
2014-06-06 14:20:57 +01:00
phikoehn
ac7670c5e7
minor bugs with factors
2014-06-06 14:14:35 +01:00
Hieu Hoang
b589c3d5c2
Merge ../mosesdecoder into hieu
2014-06-06 11:11:47 +01:00
phikoehn
7fc3ccd968
Merge branch 'master' of ssh://github.com/moses-smt/mosesdecoder
2014-06-05 21:37:20 +01:00
Philipp Koehn
243004bda6
utf8 compatible
2014-06-05 21:36:18 +01:00
phikoehn
ceadacd3af
Merge branch 'master' of ssh://github.com/moses-smt/mosesdecoder
2014-06-05 21:33:35 +01:00
Philipp Koehn
15288213be
allow < > in factors
2014-06-05 21:31:09 +01:00
Hieu Hoang
a17a45fa7f
span length
2014-06-05 17:20:38 +01:00
Hieu Hoang
ce2a69ba25
Merge ../mosesdecoder into hieu
2014-06-05 17:18:26 +01:00
Kenneth Heafield
d82bd475a2
Nadir Durrani asked me to add this script
2014-06-04 11:27:36 -07:00
Hieu Hoang
8065bf7467
add span length as a score option to train-model.perl
2014-06-04 18:06:06 +01:00
Hieu Hoang
a270811d84
Merge ../mosesdecoder into hieu
2014-05-30 09:31:14 +01:00
Barry Haddow
00b1d83841
Remove debug flag
2014-05-27 08:55:05 +01:00
Barry Haddow
66b5d2f3fd
parse hgmira output correctly
2014-05-26 11:03:28 +01:00
Barry Haddow
6c31fbb2a4
Support for hypergraph mira
2014-05-22 21:20:14 +01:00
Hieu Hoang
def22cef44
Merge branch 'hieu' of github.com:hieuhoang/mosesdecoder into hieu
2014-04-18 18:11:56 +01:00
Hieu Hoang
8d69327eb1
merge
2014-04-17 20:22:40 +01:00
Nadir Durrani
5e3e50d4ec
In-Decoding Transliteration Module
2014-04-16 17:28:49 +01:00
Hieu Hoang
f2d3052627
exit on error
2014-04-04 15:30:48 +01:00
Hieu Hoang
9d09b4a6e6
bug in converting chunker output to xml. Didn't handle chunks that crossed sentence boundaries properly
2014-04-04 13:45:18 +01:00