Matthias Huck
643fa18805
Merge branch 'GHKMStruct' of github.com:moses-smt/mosesdecoder into GHKMStruct
2013-09-13 17:13:20 +02:00
Matthias Huck
c39bed60c0
Tree fragments in GHKM glue rules;
...
output of LHS tag in tree fragments for UNKs;
GHKMParse info is now denoted as Tree info
2013-09-13 17:10:21 +02:00
maria nadejde
fad57a60a7
comment for Equal implementation
2013-09-13 16:13:36 +02:00
maria nadejde
5615a11766
sparse feature weight file
2013-09-13 16:06:48 +02:00
maria nadejde
bff123635e
added Dense and Sparse feature to scorer
2013-09-13 12:45:46 +02:00
maria nadejde
43a9323d0f
add feature files
2013-09-12 18:46:40 +02:00
maria nadejde
67b873b67d
mock feature
2013-09-12 18:40:08 +02:00
Matthias Huck
96d14555fc
GHKM tree output during extraction: modified extract-ghkm and score tools
2013-09-11 16:46:37 +02:00
Matthias Huck
004c44faf1
prototype GHKM tree output from extract-ghkm (still flawed)
2013-09-10 15:41:26 +02:00
Rico Sennrich
b421f7c9b0
refactoring to minimize overhead from flexibility score code (if off)
2013-09-07 23:04:40 +02:00
Rico Sennrich
7138056b8f
flexibility scores
2013-09-07 23:04:01 +02:00
Hieu Hoang
77872f7521
beautify
2013-07-30 15:04:37 +01:00
Hieu Hoang
9cdcf713a6
phrase penalty now has its own ff. No longer in the phrase table
2013-07-29 12:55:44 +01:00
Hieu Hoang
9e8402dedd
add placeholder support to extract
2013-07-26 15:46:15 +01:00
Hieu Hoang
e3917f911b
add placeholder support to extract
2013-07-26 15:44:29 +01:00
Hieu Hoang
2ba7a372e8
add placeholder support to extract
2013-07-26 14:12:27 +01:00
Hieu Hoang
4fde5f7ea2
eclipse file for extract-rules
2013-07-26 12:27:55 +01:00
Phil Williams
f0b603e6b5
extract-ghkm: write glue grammars for all sentence offsets
...
extract-parallel now merges separate glue grammars, so remove
previous workaround.
2013-07-25 13:53:32 +01:00
Phil Williams
b5584fdecf
extract-ghkm: workaround for extract-parallel issue
...
Don't write glue grammar or unknown word label files unless the sentence
offset is 0. This prevents multiple instances of extract-ghkm writing
to the same two files when extract-parallel is used.
TODO Better solutions might be:
1. modify extract-parallel so that it only configures one instance of
extract-ghkm to write the glue / unknown-lhs files (like the current
workaround, this assumes file chunks are representative of the whole)
2. add multithreading support directly to extract-ghkm
3. write distinct output files for each extract-ghkm instance and
combine them on completion
2013-07-23 14:55:16 +01:00
Hieu Hoang
310b26f989
beautify
2013-07-08 20:52:14 +01:00
Hieu Hoang
3eba5782c2
beautify
2013-07-08 20:25:47 +01:00
Hieu Hoang
dc33fa3d3d
redo parsing of feature function parameters
2013-06-20 12:50:41 +01:00
Hieu Hoang
abe6bb7c22
refactor parsing of feature function args
2013-06-10 18:11:55 +01:00
Hieu Hoang
6249432407
beautify
2013-05-29 18:16:15 +01:00
phikoehn
41da5b2760
Merge branch 'master' of git://github.com/moses-smt/mosesdecoder
2013-05-12 08:16:22 +01:00
Rico Sennrich
ce5311c076
fix undefined behaviour in rule extract (thanks Barry)
2013-05-07 10:50:19 +02:00
Rico Sennrich
a52f0a8c4d
avoid costly copy operation in extract-rules
...
(noticeable speed-up with large number of non-terminals:
2x speed-up in benchmark with target syntax and --MaxNonTerm 5)
2013-05-03 10:48:14 +02:00
phikoehn
d19a28ae21
Merge branch 'master' of git://github.com/moses-smt/mosesdecoder
2013-05-01 19:22:00 +01:00
phikoehn
cd8915647b
support for Chris Dyer's fast-align; bug fix with sparse word translations feature; threshold pruning in filter
2013-05-01 19:20:05 +01:00
Rico Sennrich
4e87a012d0
fix two bugs with relax-parse:
...
- size of sentence was not calculated correctly
(instead, number of positions at which a subtree starts was used)
- code entered an infinite loop sometimes; added break condition
2013-04-25 17:27:50 +02:00
phikoehn
5ba153806b
fixed kneserNey phrase probability smoothing bug reported by Česlav Przywara <ceslav@przywara.cz>
2013-03-13 17:52:24 +00:00
Barry Haddow
5f1be3217b
bugfix format of extract file for instance weighting
2013-03-07 21:40:43 +00:00
Rico Sennrich
e3ea93acb7
speed up rule extraction by factor 2 (by rewriting rule consolidation to have linear instead of quadratic complexity)
2013-02-06 13:10:38 +01:00
Barry Haddow
8db90fd2ac
instance weighting for lex reordering
2013-01-10 19:46:19 +00:00
Barry Haddow
2e8bad22e4
lex reordering scoring uses FilePiece/StringPiece
2013-01-09 17:38:48 +00:00
Barry Haddow
861792bfc5
extract can read an instance weights file.
...
Still have to parallelise.
2012-12-21 15:39:25 +00:00
Phil Williams
139148bc8f
extract-ghkm and friends: don't unescape special characters
...
Don't unescape special characters when reading XML parse trees in
extract-ghkm, extract-rules, and relax-parse.
2012-12-17 20:08:02 +00:00
Phil Williams
0ca5b8932a
extract-ghkm: tweak label collection for unknown words
...
Produce a better label set when unary rule elimination is enabled.
2012-12-17 19:43:42 +00:00
Phil Williams
fb8d20a22f
extract-ghkm: --UnknownWordMinRelFreq, --UnknownWordUniform
2012-12-17 19:02:30 +00:00
Hieu Hoang
d0cf8f47db
order of lexical probability has flipped
2012-11-22 17:37:36 +00:00
Barry Haddow
a90e1861c0
Alignments on by default for phrase-based
2012-11-15 12:35:43 +00:00
Hieu Hoang
f96b33de83
only include moses root when compiling
2012-11-14 13:43:04 +00:00
Hieu Hoang
5e3ef23cef
move moses/src/* to moses/
2012-11-12 19:56:18 +00:00
Kenneth Heafield
7d692496c3
More little jamfile changes
2012-11-12 16:57:56 +00:00
Kenneth Heafield
d74b784ad2
And pcfg-common too...
2012-11-12 16:53:42 +00:00
Kenneth Heafield
ddd3cc1d8a
Fix extract-ghkm compilation
2012-11-12 16:50:46 +00:00
Kenneth Heafield
62d37fa2b6
Refactor phrase-extract/Jamfile
2012-11-12 14:17:48 +00:00
Hieu Hoang
e75522b602
rename functions
2012-11-09 18:55:01 -05:00
Hieu Hoang
db29cc50d0
xcode
2012-11-09 18:13:20 -05:00
Kenneth Heafield
cd00219fa4
Missing dependency
2012-11-04 16:26:47 -05:00