3006 Commits

Author SHA1 Message Date
Matous Machacek
ceb70ec00c Fix small bugs (info is printed to cerr) 2012-02-26 22:23:57 +01:00
Matous Machacek
fa2eb79977 Added interpolated scorer
example: to interpolate BLEU and CDER use --sctype=BLEU,CDER
to specify weights use --scconfig=weights:0.3+0.7

This scorer should replace MergeScorer (which requires mert-moses-multi.pl) soon.
Interpolated scorer is more universal and is used in the same way as other scorers.
2012-02-26 18:53:08 +01:00
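
The weights string above pairs one weight with each scorer named in --sctype. The commit message does not spell out the formula, so the following is only a minimal sketch that assumes interpolation means a weighted sum of the individual scores. ParseWeights and Interpolate are hypothetical helper names for illustration, not functions from the Moses source.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: split a spec such as "0.3+0.7" on '+' into weights.
std::vector<double> ParseWeights(const std::string& spec) {
  std::vector<double> weights;
  std::istringstream in(spec);
  std::string field;
  while (std::getline(in, field, '+')) {
    weights.push_back(std::stod(field));  // throws if a field is not numeric
  }
  return weights;
}

// Hypothetical helper: combine per-scorer scores as a weighted sum.
// Assumption: this is what "interpolated" means; the real scorer may differ.
double Interpolate(const std::vector<double>& weights,
                   const std::vector<double>& scores) {
  if (weights.size() != scores.size()) {
    throw std::invalid_argument("need exactly one weight per scorer");
  }
  double total = 0.0;
  for (std::size_t i = 0; i < weights.size(); ++i) {
    total += weights[i] * scores[i];
  }
  return total;
}
```

Under these assumptions, Interpolate(ParseWeights("0.3+0.7"), scores) would weight the first scorer (BLEU in the example) by 0.3 and the second (CDER) by 0.7.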
Tetsuo Kiso
4291677066 Remove obsolete and unused logging statements. 2012-02-26 02:19:40 +09:00
Tetsuo Kiso
82c948e0d3 Don't call the same functions repeatedly.
Consider storing the result in a constant where possible.
2012-02-26 02:12:59 +09:00
Tetsuo Kiso
37c19feebd Minor change to the logging utility for n-gram counts.
Use std::ostream instead of directly using std::cerr.
2012-02-26 02:01:03 +09:00
Tetsuo Kiso
4a63846f82 Clean up the calculation of the effective reference length. 2012-02-26 01:54:51 +09:00
Tetsuo Kiso
51f86de1b6 Add a more efficient member to set up ScoreStats.
- Remove unnecessary conversions.

- Add 'const' to local variables.
2012-02-26 01:41:17 +09:00
Tetsuo Kiso
28cc3631cb Replace string objects with const char[]. 2012-02-26 01:18:08 +09:00
Tetsuo Kiso
40d0ff0434 Hide the implementation details of Ngram counts from the header. 2012-02-26 01:11:56 +09:00
Tetsuo Kiso
a455b06f2f Clean up commented out code snippets for debugging purposes. 2012-02-25 18:14:00 +09:00
Ondrej Bojar
5f84b6e074 accept gzipped input files (tested for non-factored phrase-based) 2012-02-24 18:02:50 +01:00
Hieu Hoang
dbb7980a72 cygwin 2012-02-24 14:18:22 +00:00
Hieu Hoang
b63308f163 zlib changes. More strongly type gzFile variables. HAVE_ZLIB probably no longer works 2012-02-24 00:04:08 +00:00
Hieu Hoang
70d4e01bde zlib changes 2012-02-23 23:54:07 +00:00
Hieu Hoang
9365723fb1 compile error when libs are updated by macports 2012-02-23 22:34:30 +00:00
Hieu Hoang
de9eeab7e9 mac osx compatible split & sort 2012-02-23 13:26:19 +00:00
phikoehn
d3e17f0ebd fix bug with < and > 2012-02-22 00:49:29 +00:00
Matous Machacek
85f9303bd1 Fixed quadratic time when adding ScoreStats to ScoreData 2012-02-21 10:39:04 +01:00
Tetsuo Kiso
fa43a88d46 Fix a memory leak. 2012-02-20 11:04:21 +09:00
Tetsuo Kiso
e749924706 Add 'virtual' to destructors. 2012-02-20 10:23:59 +09:00
Tetsuo Kiso
8c3b82e596 Change the naming conventions for the guard macros; Rename TER directory.
This change might be useful to avoid duplicate names.
Although the MERT programs are standalone
applications, some header files such as data.h and
point.h have common guard macro names like "DATA_H" and
"POINT_H", which is not a good naming convention
when you want to include external headers.
Some files actually include headers from Moses and KenLM's util.
2012-02-20 09:46:08 +09:00
Tetsuo Kiso
94888b258d Fix typo. 2012-02-20 08:29:53 +09:00
Tetsuo Kiso
232e514774 Add comments; remove unused macros. 2012-02-20 08:20:44 +09:00
Tetsuo Kiso
faab4b214d Put global variables in mert/util.cpp in anonymous space.
We do not allow clients to access the following variables.
Instead, use the APIs which we provide for that.

Also, remove the unused function, and fix smoke tests.
2012-02-20 08:02:23 +09:00
Tetsuo Kiso
8c7dfe04e7 Bug fix: tokenizer used in mert; add unit tests for that.
When tokenizing a string delimited by spaces (say, "9 9 8 7 ")
with Tokenize(), the resulting sequence of strings is
{"9", "9", "8", "7", "" }, which is different
from what we expected. We are not interested in empty strings.

This commit fixes this issue and adds unit tests for
the tokenize functions.
2012-02-20 07:39:24 +09:00
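
The behaviour described in this commit message (no empty strings in the output, even with a trailing delimiter) can be sketched as follows. This is not the actual Tokenize() from mert; TokenizeSkipEmpty is a hypothetical name, and the code only illustrates one simple way to drop empty tokens.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a tokenizer that never emits empty tokens.
std::vector<std::string> TokenizeSkipEmpty(const std::string& text,
                                           char delim = ' ') {
  std::vector<std::string> tokens;
  std::string::size_type start = 0;
  while (start < text.size()) {
    const std::string::size_type end = text.find(delim, start);
    // If no further delimiter exists, the token runs to the end of the text.
    const std::string::size_type stop =
        (end == std::string::npos) ? text.size() : end;
    if (stop > start) {
      // Only keep non-empty spans between delimiters.
      tokens.push_back(text.substr(start, stop - start));
    }
    start = stop + 1;
  }
  return tokens;
}
```

With this sketch, the input "9 9 8 7 " from the commit message yields the four tokens "9", "9", "8" and "7", with no trailing empty string.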
Tetsuo Kiso
4b6232b757 Add error checking to setup 'to_optimize'.
mert will check whether the dimension and the number of
features are equal.
2012-02-17 09:16:10 +09:00
Tetsuo Kiso
c5e7e4cea7 Bug fix mert: when you want to optimize fewer features.
This commit is a temporary bug fix.
2012-02-17 08:25:18 +09:00
Tetsuo Kiso
47b535ee0a Add a utility function to FeatureData for debugging. 2012-02-17 07:27:07 +09:00
Tetsuo Kiso
0ae4ad810f Merge branch 'master' of github.com:moses-smt/mosesdecoder 2012-02-17 03:59:58 +09:00
Tetsuo Kiso
91645503e8 Delete mert/sample/README; Add smoke tests.
Replace README with a bunch of shell scripts
for smoke testing of MERT.

The README file was not a typical README file.
It was like a sample script to run mert and
extractor, so I renamed it as smoke tests stuff.
2012-02-17 03:53:52 +09:00
Kenneth Heafield
ef4ae635d1 Hopefully safe to assume people don't need to clean up after autotools anymore 2012-02-16 12:46:24 -05:00
Philipp Koehn
faea0c3ea3 bug fixes 2012-02-16 05:14:56 +00:00
Hieu Hoang
7073f7d891 bug fix by Guchun Zhang 2012-02-16 10:28:09 +07:00
Hieu Hoang
7f6f8a99f9 bug fix by Guchun Zhang 2012-02-16 10:26:48 +07:00
Kenneth Heafield
d62f301345 Optional header installation 2012-02-13 14:31:37 -05:00
Ondrej Bojar
1363583c4f gzip output phrase-based ttable by default 2012-02-12 00:23:02 +01:00
Hieu Hoang
e85662045d compile error when using ORLM and SRI 2012-02-09 21:38:31 +07:00
Hieu Hoang
53b41f7c45 parallel extract 2012-02-09 18:24:49 +07:00
Mark Fishel
96aa6d8f56 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2012-02-09 10:11:21 +01:00
Mark Fishel
527e892d7f put the e2f table half sorting back into the threaded section for occasional minor speedup 2012-02-09 10:09:44 +01:00
Hieu Hoang
8d2663acd9 output how many threads specified 2012-02-09 15:40:11 +07:00
Mark Fishel
3c455c45f1 Made the forking phrase scoring fall back to non-forking behaviour if the --parallel switch is off 2012-02-09 09:34:00 +01:00
Kenneth Heafield
4e6ecc6457 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2012-02-08 13:29:57 -05:00
Kenneth Heafield
3c7271220d Compile ORLM. The existing code should be refactored. 2012-02-08 13:29:16 -05:00
Barry Haddow
69afc63fb0 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2012-02-08 17:48:19 +00:00
Barry Haddow
757f08a141 Really simple sharding test 2012-02-08 17:47:54 +00:00
Barry Haddow
752724594e Fix sharding bug 2012-02-08 17:11:56 +00:00
Hieu Hoang
b8c1c53e2b visual studio 2012-02-08 17:57:36 +07:00
Hieu Hoang
fbdab66587 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2012-02-08 17:26:29 +07:00
Hieu Hoang
ba5316e2e8 parallel scoring 2012-02-08 17:25:42 +07:00