Tetsuo Kiso
2a3c9fc679
Further optimization for extractor.
...
Fixes inefficient updating of N-gram counts.
NOTE: Using the '--binary' option (not yet enabled by default)
for saving outputs would lead to a significant speedup.
2012-12-07 08:45:47 +09:00
Hieu Hoang
5a783f166e
Merge branch 'master' into weight-new
2012-12-06 22:05:14 +00:00
Tetsuo Kiso
8fdec9bf30
Use boost::unordered_map instead of std::map.
...
For storing the word vocabulary used in the computation of
BLEU scores. This change reduces the running time
of extractor by about 2-3 seconds (a 9% reduction).
2012-12-07 05:12:24 +09:00
Tetsuo Kiso
6c04c4ad9c
Add more tests to the Data class.
2012-12-07 02:46:59 +09:00
Tetsuo Kiso
c7f6e38326
Use FilePiece to load N-best lists.
...
FilePiece works well with StringPiece.
2012-12-07 02:39:02 +09:00
Hieu Hoang
3d6d53bf49
delete hardcoded if() statements to show scores in n-best list. Excluded UnknownWordPenalty and made sure PhraseModel & Generation are in a particular order
2012-12-06 17:28:56 +00:00
Hieu Hoang
08ca44b34e
delete hardcoded if() statements for show weights. Excluded UnknownWordPenalty and made sure PhraseModel & Generation are in a particular order
2012-12-06 17:13:00 +00:00
Hieu Hoang
28b70a5697
delete hardcoded if() statements for show weights. Excluded UnknownWordPenalty and made sure PhraseModel & Generation are in a particular order
2012-12-06 16:59:54 +00:00
Tetsuo Kiso
38e145e556
Use util::TokenIter to tokenize n-best lists.
...
Also reduces the creation of std::string objects. In both the ScoreArray
and FeatureArray classes, the private members that track sentence
indices (namely, "m_index") were unnecessarily declared as
std::string; it is better to declare them directly as 'int'.
2012-12-07 01:39:22 +09:00
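The idea in the commit above (tokenize without building std::string objects, and keep the sentence index as an int) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses C++17 std::string_view as a stand-in for StringPiece and util::TokenIter, whose APIs are not shown here, and the helper names SplitFields and ParseIndex are hypothetical. It assumes the usual " ||| "-delimited n-best line layout with the sentence index in the first field.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical helper, not the extractor's code.
// Splits `line` on the multi-character delimiter `delim` without copying:
// every returned field is a view into the caller's buffer.
std::vector<std::string_view> SplitFields(std::string_view line,
                                          std::string_view delim) {
  std::vector<std::string_view> fields;
  std::size_t pos = 0;
  while (true) {
    const std::size_t next = line.find(delim, pos);
    if (next == std::string_view::npos) {
      fields.push_back(line.substr(pos));  // last field
      break;
    }
    fields.push_back(line.substr(pos, next - pos));
    pos = next + delim.size();
  }
  return fields;
}

// Hypothetical helper: parses the sentence index straight into an int,
// so no std::string is created. Returns false on malformed input.
bool ParseIndex(std::string_view field, int* index) {
  const char* begin = field.data();
  const char* end = begin + field.size();
  const std::from_chars_result result = std::from_chars(begin, end, *index);
  return result.ec == std::errc() && result.ptr == end;
}
```

Calling SplitFields on a line such as "12 ||| a b ||| 0.5 -1 ||| -3.2" with the delimiter " ||| " yields four views, and ParseIndex on the first one stores 12. The views are only valid while the underlying line buffer is alive.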
Hieu Hoang
e3def0bc78
convert all other weight-* to [weight]
2012-12-06 16:19:18 +00:00
Hieu Hoang
af459277b8
correct name of syntactic LM
2012-12-06 15:23:48 +00:00
Hieu Hoang
eb12e4c808
Merge branch 'master' into weight-new
2012-12-06 14:48:20 +00:00
Hieu Hoang
6b58f88df9
Works with multiple phrase-tables
2012-12-06 14:46:52 +00:00
Tetsuo Kiso
cd3fb3b831
Untabify.
2012-12-06 23:46:22 +09:00
Tetsuo Kiso
ac045a11c1
Speed up N-gram counts when running extractor.
...
By replacing std::map with boost::unordered_map.
Runtime of extractor on 100-best lists of 2679 sentences:
Before:
real 0m35.314s
user 0m34.030s
sys 0m1.280s
After:
real 0m26.729s
user 0m25.420s
sys 0m1.310s
2012-12-06 22:08:33 +09:00
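As a rough illustration of the change described in the commit above, N-gram counts can be kept in a hash map keyed by the word-id sequence. This sketch makes several assumptions: it uses std::unordered_map rather than boost::unordered_map so that it is self-contained, and the names NGram, NGramHash, NGramCounts and CountNGrams are hypothetical, not the extractor's actual types.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical names throughout; not the extractor's actual types.
using NGram = std::vector<int>;  // word ids of one n-gram

// Hash for a sequence of word ids, mixing each element in the style of
// boost::hash_combine.
struct NGramHash {
  std::size_t operator()(const NGram& ngram) const {
    std::size_t seed = 0;
    for (int id : ngram) {
      seed ^= std::hash<int>()(id) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }
    return seed;
  }
};

using NGramCounts = std::unordered_map<NGram, int, NGramHash>;

// Counts every n-gram of order 1..max_order in a sentence of word ids.
NGramCounts CountNGrams(const std::vector<int>& sentence,
                        std::size_t max_order) {
  NGramCounts counts;
  for (std::size_t start = 0; start < sentence.size(); ++start) {
    NGram ngram;
    for (std::size_t n = 1;
         n <= max_order && start + n <= sentence.size(); ++n) {
      ngram.push_back(sentence[start + n - 1]);
      // Average constant-time lookup, versus logarithmic in an ordered map.
      ++counts[ngram];
    }
  }
  return counts;
}
```

For the word-id sequence {1, 2, 1, 2} with max_order 2, this counts the unigrams {1} and {2} twice each, the bigram {1, 2} twice, and the bigram {2, 1} once. A hash map gives up the sorted iteration order of std::map, which is acceptable when the counts are only looked up or summed.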
Hieu Hoang
da9cd0e3aa
clean up weights code for confusion networks & lattices. Works, except for multiple phrase-tables or factors
2012-12-05 20:21:33 +00:00
Hieu Hoang
b8d4c64d6d
deprecate -translation-systems
2012-12-05 17:58:45 +00:00
Hieu Hoang
9f767d4eba
lexical reordering model works with new weight setup
2012-12-05 17:19:10 +00:00
Hieu Hoang
768d165600
works ok for plain phrase-based decoding. No lexical reordering model
2012-12-05 17:12:01 +00:00
Hieu Hoang
4f3805a0d7
change \!UnknownWordPenalty to UnknownWordPenalty
2012-12-05 11:58:21 +00:00
Hieu Hoang
cb270a76be
all regression tests passed
2012-12-04 18:39:06 +00:00
Hieu Hoang
08b508d686
indentation
2012-12-04 17:54:39 +00:00
Hieu Hoang
9fe742ce52
get rid of function GetScoreProducerWeightShortName(). Fails 1 regression test
2012-12-04 17:09:23 +00:00
Hieu Hoang
33105a7ba7
get rid of int argument from GetScoreProducerWeightShortName()
2012-12-04 13:08:00 +00:00
Hieu Hoang
55f65c3104
race condition in chart decoding with -T arg
2012-12-03 14:57:33 +00:00
phikoehn
ab2effb6fe
train MML in-/out-of-domain language models with same vocabulary
2012-12-01 13:46:59 +00:00
phikoehn
269883fedd
Merge branch 'master' of git://github.com/moses-smt/mosesdecoder
2012-12-01 13:45:00 +00:00
phikoehn
0c5d000192
my change to weight-wt
2012-12-01 13:44:57 +00:00
Marcin Junczys-Dowmunt
205cea8644
Allow .minlexr suffix and bugfix
2012-12-01 00:38:20 +01:00
Eva Hasler
650d31fe73
don't need to specify weight-wt
2012-11-30 18:04:50 +00:00
Hieu Hoang
a07f71d095
race condition on letter sed cache. Requires locking
2012-11-30 17:15:32 +00:00
Hieu Hoang
7abb3c878a
remove locking. Make wordIndex variable local
2012-11-30 13:50:59 +00:00
Hieu Hoang
5fd9cbb529
delete reference to numpy. Doesn't need it
2012-11-30 10:28:51 +00:00
Hieu Hoang
017bbe78e8
forgotten misc programs for Compact pt
2012-11-30 09:49:36 +00:00
phikoehn
338b7656a6
ooops
2012-11-30 07:36:59 +00:00
phikoehn
84cb04c05a
fixes and extensions to modified Moore-Lewis filtering, now works with domain features
2012-11-30 07:28:31 +00:00
phikoehn
1f7ee0e6c5
change of settings for sigtest filtering
2012-11-29 23:44:10 +00:00
Hieu Hoang
d4ead15066
fuzzy match phrase-table is multi-threaded
2012-11-29 15:27:38 +00:00
Hieu Hoang
9aad7c65c9
move CompactPt to TranslationModel/
2012-11-27 18:04:01 +00:00
Hieu Hoang
152064086f
Merge branch 'master' of github.com:moses-smt/mosesdecoder
2012-11-27 17:33:42 +00:00
Hieu Hoang
b317ac1a34
compile error on misc programs
2012-11-27 17:33:04 +00:00
Hieu Hoang
bc1e96730d
move CKY+Parser to TranslationModel/
2012-11-27 17:23:31 +00:00
Hieu Hoang
ae8a48b022
move Score3Parser to TranslationModel/
2012-11-27 17:09:23 +00:00
Hieu Hoang
1aae9aa23c
move RuleTable to TranslationModel/
2012-11-27 16:57:23 +00:00
Hieu Hoang
6bf2870f18
move the rest of DynSA to TranslationModel/
2012-11-27 16:31:42 +00:00
Hieu Hoang
4d8e4ae6d8
move DynSAInclude to TranslationModel/
2012-11-27 16:16:30 +00:00
Barry Haddow
f0e12912e7
mml-score.py. Support for combining with domain features.
2012-11-27 15:58:55 +00:00
Hieu Hoang
75108c0aaf
minor debug messages
2012-11-27 15:39:08 +00:00
Hieu Hoang
0b54d32038
move fuzzy-match to TranslationModel/
2012-11-27 15:36:24 +00:00
Hieu Hoang
59449f2925
make TranslationModel subdirectory and move files from moses/ into it
2012-11-27 15:08:31 +00:00