Phil Williams
f0b603e6b5
extract-ghkm: write glue grammars for all sentence offsets
extract-parallel now merges separate glue grammars, so remove
previous workaround.
2013-07-25 13:53:32 +01:00
Phil Williams
b5584fdecf
extract-ghkm: workaround for extract-parallel issue
Don't write glue grammar or unknown word label files unless the sentence
offset is 0. This prevents multiple instances of extract-ghkm writing
to the same two files when extract-parallel is used.
TODO Better solutions might be:
1. modify extract-parallel so that it only configures one instance of
extract-ghkm to write the glue / unknown-lhs files (like the current
workaround, this assumes file chunks are representative of the whole)
2. add multithreading support directly to extract-ghkm
3. write distinct output files for each extract-ghkm instance and
combine them on completion
2013-07-23 14:55:16 +01:00
Hieu Hoang
310b26f989
beautify
2013-07-08 20:52:14 +01:00
Hieu Hoang
3eba5782c2
beautify
2013-07-08 20:25:47 +01:00
Hieu Hoang
dc33fa3d3d
redo parsing of feature function parameters
2013-06-20 12:50:41 +01:00
Hieu Hoang
abe6bb7c22
refactor parsing of feature function args
2013-06-10 18:11:55 +01:00
Hieu Hoang
6249432407
beautify
2013-05-29 18:16:15 +01:00
phikoehn
41da5b2760
Merge branch 'master' of git://github.com/moses-smt/mosesdecoder
2013-05-12 08:16:22 +01:00
Rico Sennrich
ce5311c076
fix undefined behaviour in rule extract (thanks Barry)
2013-05-07 10:50:19 +02:00
Rico Sennrich
a52f0a8c4d
avoid costly copy operation in extract-rules
(noticeable speed-up with large number of non-terminals:
2x speed-up in benchmark with target syntax and --MaxNonTerm 5)
2013-05-03 10:48:14 +02:00
phikoehn
d19a28ae21
Merge branch 'master' of git://github.com/moses-smt/mosesdecoder
2013-05-01 19:22:00 +01:00
phikoehn
cd8915647b
support for Chris Dyer's fast-align; bug fix with sparse word translations feature; threshold pruning in filter
2013-05-01 19:20:05 +01:00
Rico Sennrich
4e87a012d0
fix two bugs with relax-parse:
- size of sentence was not calculated correctly
(instead, number of positions at which a subtree starts was used)
- code entered an infinite loop sometimes; added break condition
2013-04-25 17:27:50 +02:00
phikoehn
5ba153806b
fixed kneserNey phrase probability smoothing bug reported by Česlav Przywara <ceslav@przywara.cz>
2013-03-13 17:52:24 +00:00
Barry Haddow
5f1be3217b
bugfix format of extract file for instance weighting
2013-03-07 21:40:43 +00:00
Rico Sennrich
e3ea93acb7
speed up rule extraction by factor 2 (by rewriting rule consolidation to have linear instead of quadratic complexity)
2013-02-06 13:10:38 +01:00
Barry Haddow
8db90fd2ac
instance weighting for lex reordering
2013-01-10 19:46:19 +00:00
Barry Haddow
2e8bad22e4
lex reordering scoring uses FilePiece/StringPiece
2013-01-09 17:38:48 +00:00
Barry Haddow
861792bfc5
extract can read an instance weights file.
Still have to parallelise.
2012-12-21 15:39:25 +00:00
Phil Williams
139148bc8f
extract-ghkm and friends: don't unescape special characters
Don't unescape special characters when reading XML parse trees in
extract-ghkm, extract-rules, and relax-parse.
2012-12-17 20:08:02 +00:00
Phil Williams
0ca5b8932a
extract-ghkm: tweak label collection for unknown words
Produce a better label set when unary rule elimination is enabled.
2012-12-17 19:43:42 +00:00
Phil Williams
fb8d20a22f
extract-ghkm: --UnknownWordMinRelFreq, --UnknownWordUniform
2012-12-17 19:02:30 +00:00
Hieu Hoang
d0cf8f47db
order of lexical probability has flipped
2012-11-22 17:37:36 +00:00
Barry Haddow
a90e1861c0
Alignments on by default for phrase-based
2012-11-15 12:35:43 +00:00
Hieu Hoang
f96b33de83
only include moses root when compiling
2012-11-14 13:43:04 +00:00
Hieu Hoang
5e3ef23cef
move moses/src/* to moses/
2012-11-12 19:56:18 +00:00
Kenneth Heafield
7d692496c3
More little jamfile changes
2012-11-12 16:57:56 +00:00
Kenneth Heafield
d74b784ad2
And pcfg-common too...
2012-11-12 16:53:42 +00:00
Kenneth Heafield
ddd3cc1d8a
Fix extract-ghkm compilation
2012-11-12 16:50:46 +00:00
Kenneth Heafield
62d37fa2b6
Refactor phrase-extract/Jamfile
2012-11-12 14:17:48 +00:00
Hieu Hoang
e75522b602
rename functions
2012-11-09 18:55:01 -05:00
Hieu Hoang
db29cc50d0
xcode
2012-11-09 18:13:20 -05:00
Kenneth Heafield
cd00219fa4
Missing dependency
2012-11-04 16:26:47 -05:00
Barry Haddow
62fa6d6f28
Feature function interface for use in scoring
2012-11-02 23:30:51 +00:00
Barry Haddow
d1d5fe4036
Remove -SentenceId (since we have -IncludeSentenceId now)
2012-10-22 22:03:43 +01:00
Barry Haddow
848aafb644
Merge remote branch 'github/master' into miramerge
Conflicts:
moses/src/AlignmentInfo.cpp
moses/src/AlignmentInfo.h
moses/src/ChartHypothesis.cpp
moses/src/ChartTrellisNode.cpp
moses/src/LM/Implementation.cpp
moses/src/LM/Ken.cpp
moses/src/TargetPhrase.cpp
moses/src/TargetPhrase.h
2012-10-08 17:54:59 +01:00
Phil Williams
0851a4d113
extract-ghkm: add --SentenceOffset option
This should behave the same as the --SentenceOffset option for
extract-rules. The extract-parallel.perl script expects the rule
extractor to have this option.
2012-10-03 20:04:09 +01:00
Barry Haddow
0a950ee9f4
Merge remote branch 'github/master' into miramerge
Compiles, but not tested. Had to disable relent filter. Strangely, it seems to contain the
whole of moses-cmd.
Conflicts:
Jamroot
OnDiskPt/TargetPhrase.cpp
moses-cmd/src/Main.cpp
moses/src/AlignmentInfo.cpp
moses/src/AlignmentInfo.h
moses/src/ChartTranslationOptionCollection.cpp
moses/src/ChartTranslationOptionCollection.h
moses/src/GenerationDictionary.cpp
moses/src/Jamfile
moses/src/Parameter.cpp
moses/src/PhraseDictionary.cpp
moses/src/StaticData.cpp
moses/src/StaticData.h
moses/src/TargetPhrase.h
moses/src/TranslationSystem.cpp
moses/src/TranslationSystem.h
moses/src/Word.cpp
phrase-extract/score.cpp
regression-testing/Jamfile
scripts/ems/experiment.meta
scripts/ems/experiment.perl
scripts/training/train-model.perl
2012-09-26 22:49:33 +01:00
phikoehn
28e8832a15
bug fix domain features
2012-09-25 01:22:09 +01:00
Kenneth Heafield
0cddf8a58b
Fix compilation without threads
2012-09-21 23:11:59 +01:00
Eva Hasler
21938e4d94
initialize correct variable (includeSentenceIdFlag)
2012-09-12 20:02:57 +01:00
phikoehn
5d9859ba0e
merge issues
2012-09-03 07:27:41 +01:00
phikoehn
e072a7f9a7
merge issues
2012-09-03 07:24:07 +01:00
phikoehn
0e783dc529
bug fix to enable pruned search graph output by default
2012-09-03 07:23:32 +01:00
phikoehn
d99f97297f
merges
2012-09-03 07:21:47 +01:00
Hieu Hoang
c639cdbb38
binary hiero reordering feature. Integrated into train-model.perl and experiment.perl. In the 2nd to last position in phrase table, just in front of 2.718
2012-08-28 17:01:08 +01:00
Hieu Hoang
33c03edfbb
binary hiero reordering feature. Implementation of 1 described in nist 2012. 1 if non-term is reordered wrt other words or non-terms. 0 otherwise
2012-08-25 00:47:57 +01:00
Hieu Hoang
fa56d7861f
word alignment info for hiero grammar
2012-08-24 19:11:35 +01:00
Hieu Hoang
69fc00faf9
singleton feature in phrase table. Like similar feature in Adam's suffix array, as implemented in cdec
2012-08-24 00:54:05 +01:00
Hieu Hoang
5dbb0e66ce
option to produce rules that have boundary <s> & </s> words. Like Chris Dyer's extraction
2012-08-23 19:40:09 +01:00