Matthias Huck
c27cbf55ea
source labels: integration into EMS
2014-08-07 21:02:51 +01:00
Ulrich Germann
df3fb4ac5c
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
Conflicts:
doc/Mmsapt.howto
2014-08-04 17:26:15 +01:00
Ulrich Germann
2711360ce7
Added missing #include.
2014-08-04 17:19:58 +01:00
Barry Haddow
65b3e0b96e
Missing include
2014-08-01 11:13:34 +01:00
Matthias Huck
7b02017da1
use std::numeric_limits
2014-07-28 19:49:43 +01:00
Matthias Huck
3a5dee12e8
implementation of phrase orientation in GHKM extraction
(...but a corresponding feature function for the chart-based decoder has not been written yet)
2014-07-28 18:27:12 +01:00
Matthias Huck
d0e92da734
GHKM extraction can add a source labels phrase property
2014-06-11 19:27:18 +01:00
Nicola Bertoldi
0ca98837db
beautify
2014-05-19 15:35:33 +02:00
Nicola Bertoldi
20381cbf89
merged master into dynamic-models and solved conflicts
2014-04-28 19:18:38 +02:00
Rico Sennrich
c8682e9420
target-syntax: use SoftMatchingFeature to assign non-terminal to unknown words
2014-03-24 14:57:24 +00:00
Matthias Huck
65811a0325
tree fragments: tiny issues with the extraction pipeline
2014-02-03 18:13:10 +00:00
Nicola Bertoldi
e452a13062
beautify
2014-01-15 16:49:57 +01:00
Phil Williams
6bee77e207
extract-ghkm: use square brackets for glue rule internal tree structure
2013-11-12 15:49:49 +00:00
Hieu Hoang
24f95297fc
compiles with clang
2013-10-31 12:46:41 +00:00
Phil Williams
2a28d1a73e
Merge branch 'master' into GHKMStruct
Conflicts:
moses-chart-cmd/IOWrapper.cpp
moses-chart-cmd/IOWrapper.h
moses/FF/Factory.cpp
moses/Parameter.cpp
moses/StaticData.h
phrase-extract/extract-ghkm/ScfgRuleWriter.cpp
phrase-extract/score-main.cpp
2013-09-29 15:27:09 +01:00
Phil Williams
e497dc4857
Remove NT length code missed in commit cdd9df19...
2013-09-29 15:09:14 +01:00
Phil Williams
940591a1a3
extract-ghkm: allow trailing whitespace in alignment file
Thanks to Matt Post for reporting the problem.
2013-09-26 15:49:08 +01:00
Phil Williams
23488e1adb
extract-ghkm: use square brackets for --TreeFragments
Use square brackets instead of round brackets for internal tree
structure. This avoids the need for additional escaping since
square brackets are already escaped in Moses.
Also: tweak code style to match the rest of the source file, and
output less whitespace to make the extract files (marginally)
smaller.
2013-09-20 14:57:40 +01:00
Matthias Huck
a6d172e0f1
command line option for extract-ghkm: --TreeFragments
2013-09-16 20:06:02 +01:00
Matthias Huck
c39bed60c0
Tree fragments in GHKM glue rules;
output of LHS tag in tree fragments for UNKs;
GHKMParse info is now denoted as Tree info
2013-09-13 17:10:21 +02:00
Matthias Huck
96d14555fc
GHKM tree output during extraction: modified extract-ghkm and score tools
2013-09-11 16:46:37 +02:00
Matthias Huck
004c44faf1
prototype GHKM tree output from extract-ghkm (still flawed)
2013-09-10 15:41:26 +02:00
Phil Williams
f0b603e6b5
extract-ghkm: write glue grammars for all sentence offsets
extract-parallel now merges separate glue grammars, so remove
previous workaround.
2013-07-25 13:53:32 +01:00
Phil Williams
b5584fdecf
extract-ghkm: workaround for extract-parallel issue
Don't write glue grammar or unknown word label files unless the sentence
offset is 0. This prevents multiple instances of extract-ghkm writing
to the same two files when extract-parallel is used.
TODO Better solutions might be:
1. modify extract-parallel so that it only configures one instance of
   extract-ghkm to write the glue / unknown-lhs files (like the current
   workaround, this assumes file chunks are representative of the whole)
2. add multithreading support directly to extract-ghkm
3. write distinct output files for each extract-ghkm instance and
   combine them on completion
2013-07-23 14:55:16 +01:00
Hieu Hoang
6249432407
beautify
2013-05-29 18:16:15 +01:00
Phil Williams
139148bc8f
extract-ghkm and friends: don't unescape special characters
Don't unescape special characters when reading XML parse trees in
extract-ghkm, extract-rules, and relax-parse.
2012-12-17 20:08:02 +00:00
Phil Williams
0ca5b8932a
extract-ghkm: tweak label collection for unknown words
Produce a better label set when unary rule elimination is enabled.
2012-12-17 19:43:42 +00:00
Phil Williams
fb8d20a22f
extract-ghkm: --UnknownWordMinRelFreq, --UnknownWordUniform
2012-12-17 19:02:30 +00:00
Kenneth Heafield
ddd3cc1d8a
Fix extract-ghkm compilation
2012-11-12 16:50:46 +00:00
Phil Williams
0851a4d113
extract-ghkm: add --SentenceOffset option
This should behave the same as the --SentenceOffset option for
extract-rules. The extract-parallel.perl script expects the rule
extractor to have this option.
2012-10-03 20:04:09 +01:00
Hieu Hoang
121e258e84
namespace all classes in mert directory
2012-06-30 21:39:10 +01:00
Hieu Hoang
4eef94b121
move c++ code out of /script/ to /
2012-05-31 17:24:06 +01:00