255 Commits

Author SHA1 Message Date
Tetsuo Kiso
6b1dfa3434 Clean up Data::loadnbest().
Add helper functions.
2012-03-07 07:01:28 +09:00
Tetsuo Kiso
d6c1abe6bb Rewrite FeatureData::setFeatureMap(); add the unit test. 2012-03-07 06:32:38 +09:00
Tetsuo Kiso
5c4e2a8c8d Use boost::scoped_ptr to avoid resource leaks. 2012-03-05 00:35:07 +09:00
Tetsuo Kiso
c8800f3822 Change the private member function in mert/Timer. 2012-03-03 23:49:17 +09:00
Tetsuo Kiso
ee5174de58 Delete assertions to check elapsed CPU time.
The accuracy of getrusage() is limited by the resolution
of the software clock, as described in
http://www.kernel.org/doc/man-pages/online/pages/man7/time.7.html

The assertions required a timer with microsecond accuracy.
However, we don't necessarily need such a timer, and we don't
want to add time-consuming steps to the test code: we rebuild
the programs again and again, so we want the unit tests
to run as quickly as possible.
2012-03-03 23:24:08 +09:00
Tetsuo Kiso
9a46c5cd7f Disable undesirable copying of Timer objects. 2012-03-03 21:12:40 +09:00
Matous Machacek
f196a87763 Fix mert.cpp to work with InterpolatedScorer 2012-03-02 14:16:05 +01:00
Tetsuo Kiso
7735670a57 Disable failed assertions of TimerTest anyway.
This commit is kludgy. A better solution to the problem will be pushed.
Note that the assertions have no impact on the MERT process.
2012-02-29 12:38:02 +09:00
Tetsuo Kiso
b99ebb7a19 Fix failure of the Timer unit test. 2012-02-28 12:34:40 +09:00
Matous Machacek
ba987c94ba Support for using factors in mert and evaluator
Example:
Use --factor "0|2" to use only the first and third factors from the n-best list and from the reference.
If you use the interpolated scorer, separate records with a comma (e.g. --factor "0|2,1").
2012-02-28 02:27:23 +01:00
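
To illustrate the --factor "0|2" selection described in the commit above, here is a minimal sketch. It assumes that each token carries its factors separated by '|' (for example "house|haus|NN") and that the selected factors are joined with '|' again. SplitOn() and SelectFactors() are hypothetical names chosen for this example, not functions from mert.

```cpp
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split `s` on `delim`.
static std::vector<std::string> SplitOn(const std::string& s, char delim) {
  std::vector<std::string> parts;
  std::istringstream in(s);
  std::string item;
  while (std::getline(in, item, delim)) parts.push_back(item);
  return parts;
}

// Hypothetical helper, not the mert implementation: keep only the factors of
// `token` whose zero-based indices are listed in `spec` (e.g. "0|2").
// Indices that are out of range are skipped.
std::string SelectFactors(const std::string& token, const std::string& spec) {
  const std::vector<std::string> factors = SplitOn(token, '|');
  const std::vector<std::string> wanted = SplitOn(spec, '|');
  std::string out;
  bool first = true;
  for (std::size_t i = 0; i < wanted.size(); ++i) {
    const std::size_t index =
        static_cast<std::size_t>(std::atoi(wanted[i].c_str()));
    if (index >= factors.size()) continue;
    if (!first) out += '|';
    out += factors[index];
    first = false;
  }
  return out;
}
```

Under these assumptions, SelectFactors("house|haus|NN", "0|2") keeps the first and third factors of the token.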
Tetsuo Kiso
6d6fb4383d Fix a mistake in a previous commit: tuning on a subset of features.
In the commit 4b6232b757,
I thought I had fixed the bug around tuning on a subset of
features by checking whether pdim equals the number of
active features that you want to optimize in the tuning.

However, that was wrong. I should set Point::optindices
appropriately according to the specified subset.
2012-02-28 00:35:42 +09:00
Tetsuo Kiso
c3bb4c7abd Fix compiling mert: add a missing header. 2012-02-27 18:50:27 +09:00
Tetsuo Kiso
5e74e87da0 Fix memory leaks.
- The Scorer and ScoreData objects allocated by the new
  operator are now released using the ScopedVector class.

- Add 'virtual' to inherited functions from the Scorer
  class.
2012-02-27 14:30:37 +09:00
Tetsuo Kiso
04a717be2b Merge branch 'master' of github.com:moses-smt/mosesdecoder 2012-02-27 08:35:09 +09:00
Tetsuo Kiso
7093d2e2cd Change mert/Timer.
- Add a high resolution timing function to measure the
  wall-clock time by gettimeofday().

- Now the Timer class uses getrusage() to measure the elapsed
  CPU time as KenLM does.

- Revive Timer::restart().

- Add Timer::ToString() for reporting detailed statistics
  as well as for debugging.

- Add a simple unit test for Timer.
2012-02-27 08:34:51 +09:00
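
The two measurements named in the commit above can be sketched as follows, assuming a POSIX system. WallTimeSeconds() and UserCpuSeconds() are hypothetical names for this example and are not the actual mert/Timer interface; only gettimeofday() and getrusage() come from the commit message.

```cpp
#include <cstddef>
#include <sys/resource.h>
#include <sys/time.h>

// Wall-clock time in seconds, read with gettimeofday().
double WallTimeSeconds() {
  timeval tv;
  gettimeofday(&tv, NULL);
  return static_cast<double>(tv.tv_sec) + tv.tv_usec / 1e6;
}

// User CPU time consumed by this process in seconds, read with getrusage().
// Returns 0.0 if the call fails.
double UserCpuSeconds() {
  rusage usage;
  if (getrusage(RUSAGE_SELF, &usage) != 0) return 0.0;
  return static_cast<double>(usage.ru_utime.tv_sec) +
         usage.ru_utime.tv_usec / 1e6;
}
```

An elapsed interval is the difference between two readings of the same function. Because the resolution of getrusage() depends on the software clock, a test built on this sketch should not assert microsecond accuracy for the CPU time.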
Matous Machacek
e3f0280f27 Change of evaluator usage (see mert/evaluator --help). 2012-02-26 23:04:02 +01:00
Matous Machacek
99a98a336b Check of the number of weights in InterpolatedScorer 2012-02-26 22:52:20 +01:00
Matous Machacek
bd92b0634a Fix small bugs (info is printed to cerr) 2012-02-26 22:23:57 +01:00
Matous Machacek
e8a94a7bd2 Added interpolated scorer
example: to interpolate BLEU and CDER use --sctype=BLEU,CDER
to specify weights use --scconfig=weights:0.3+0.7

This scorer should replace MergeScorer (which requires mert-moses-multi.pl) soon.
Interpolated scorer is more universal and is used in the same way as other scorers.
2012-02-26 18:53:08 +01:00
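
As a rough illustration of the weights option in the commit above, the sketch below parses a weight string such as "0.3+0.7" and combines per-metric scores. It assumes that interpolation means a weighted sum of the individual scores, which the commit message does not state explicitly. ParseWeights() and Interpolate() are hypothetical names, not the InterpolatedScorer API.

```cpp
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: parse "0.3+0.7" into {0.3, 0.7}.
std::vector<double> ParseWeights(const std::string& spec) {
  std::vector<double> weights;
  std::istringstream in(spec);
  std::string item;
  while (std::getline(in, item, '+')) {
    weights.push_back(std::atof(item.c_str()));
  }
  return weights;
}

// Assumed combination rule: weighted sum of the per-metric scores.
// Throws if the number of weights differs from the number of scores.
double Interpolate(const std::vector<double>& scores,
                   const std::vector<double>& weights) {
  if (scores.size() != weights.size()) {
    throw std::invalid_argument("number of weights must match number of scores");
  }
  double total = 0.0;
  for (std::size_t i = 0; i < scores.size(); ++i) {
    total += weights[i] * scores[i];
  }
  return total;
}
```

The size check mirrors the idea of verifying the number of weights against the number of scorers before combining them.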
Tetsuo Kiso
3b47348550 Clean up the Timer class in mert. 2012-02-26 14:40:17 +09:00
Tetsuo Kiso
0c24f7e10b Remove unused members. 2012-02-26 13:58:48 +09:00
Tetsuo Kiso
c62365b419 Prefix private members with "m_". 2012-02-26 13:52:47 +09:00
Tetsuo Kiso
ff5ae511b1 Clean up ScoreStats::set(); Remove a constructor which has a string. 2012-02-26 13:44:47 +09:00
Tetsuo Kiso
9d6263d337 Remove unnecessary conversions using ostream_iterator. 2012-02-26 13:14:34 +09:00
Tetsuo Kiso
c913effe13 Clean up. 2012-02-26 13:04:27 +09:00
Tetsuo Kiso
c26e83fd09 Remove obsolete and unused logging statements. 2012-02-26 02:19:40 +09:00
Tetsuo Kiso
224c654fa5 Don't call the same functions repeatedly.
Consider storing the result in a constant where possible.
2012-02-26 02:12:59 +09:00
Tetsuo Kiso
669b9d9c7a Minor change to the logging utility for n-gram counts.
Use std::ostream instead of directly using std::cerr.
2012-02-26 02:01:03 +09:00
Tetsuo Kiso
8e0a61d0d7 Clean up the calculation of the effective reference length. 2012-02-26 01:54:51 +09:00
Tetsuo Kiso
c4fa8a3865 Add a more efficient member to set up ScoreStats.
- Remove unnecessary conversions.

- Add 'const' to local variables.
2012-02-26 01:41:17 +09:00
Tetsuo Kiso
2c2bd63bbd Replace string objects with const char[]. 2012-02-26 01:18:08 +09:00
Tetsuo Kiso
17f06a3250 Hide the implementation details of Ngram counts from the header. 2012-02-26 01:11:56 +09:00
Tetsuo Kiso
0c9023abc6 Clean up commented out code snippets for debugging purposes. 2012-02-25 18:14:00 +09:00
Matous Machacek
16376eabcc Fixed quadratic time when adding ScoreStats to ScoreData 2012-02-21 10:39:04 +01:00
Tetsuo Kiso
aefa6e1000 Fix a memory leak. 2012-02-20 11:04:21 +09:00
Tetsuo Kiso
c2ef7093ed Add 'virtual' to destructors. 2012-02-20 10:23:59 +09:00
Tetsuo Kiso
47ac8a474d Change the naming conventions for the guard macros; Rename TER directory.
This change might be useful to avoid duplicating the names.
The reason is that although MERT programs are standalone
applications, some header files such as data.h and
point.h have common guard macro names like "DATA_H" and
"POINT_H", and this is not a good naming convention
when you want to include external headers.
Some files actually include headers in Moses and KenLM's util.
2012-02-20 09:46:08 +09:00
Tetsuo Kiso
82da44b030 Fix typo. 2012-02-20 08:29:53 +09:00
Tetsuo Kiso
ce7b136994 Add comments; remove unused macros. 2012-02-20 08:20:44 +09:00
Tetsuo Kiso
a70925317e Put global variables in mert/util.cpp in an anonymous namespace.
We do not allow clients to access the following variables.
Instead, use the APIs which we provide for that.

Also, remove the unused function, and fix smoke tests.
2012-02-20 08:02:23 +09:00
Tetsuo Kiso
5d1cfa0ebb Bug fix: tokenizer used in mert; add unit tests for that.
When tokenizing a string delimited by spaces (say, "9 9 8 7 ")
with Tokenize(), the resulting sequence of strings is
{"9", "9", "8", "7", "" }, which is different
from what we expected. We are not interested in empty strings.

This commit fixes this issue and adds unit tests for
the tokenize functions.
2012-02-20 07:39:24 +09:00
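
The behavior described in the commit above can be sketched as a tokenizer that drops empty tokens, so that a trailing delimiter no longer produces "". TokenizeSkipEmpty() is a hypothetical name for this example and is not the actual Tokenize() in mert.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split `text` on `delim` and skip empty tokens, so
// leading, trailing, or repeated delimiters add nothing to the result.
std::vector<std::string> TokenizeSkipEmpty(const std::string& text,
                                           char delim = ' ') {
  std::vector<std::string> tokens;
  std::string::size_type start = 0;
  while (start < text.size()) {
    const std::string::size_type end = text.find(delim, start);
    const std::string::size_type stop =
        (end == std::string::npos) ? text.size() : end;
    if (stop > start) {
      tokens.push_back(text.substr(start, stop - start));
    }
    start = stop + 1;
  }
  return tokens;
}
```

With this sketch, tokenizing "9 9 8 7 " gives the four tokens "9", "9", "8", "7" and no trailing empty string.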
Tetsuo Kiso
a7666735b5 Add error checking to setup 'to_optimize'.
mert will check whether the dimension and the number of
features are equal.
2012-02-17 09:16:10 +09:00
Tetsuo Kiso
6c003e544a Bug fix mert: when you want to optimize fewer features.
This commit is a temporary bug fix.
2012-02-17 08:25:18 +09:00
Tetsuo Kiso
819dc9e0f9 Add a utility function to FeatureData for debugging. 2012-02-17 07:27:07 +09:00
Tetsuo Kiso
c1b85b480c Delete mert/sample/README; Add smoke tests.
Replace README with a set of shell scripts
for smoke testing of MERT.

The README file was not a typical README file.
It was like a sample script to run mert and
extractor, so I renamed it as smoke tests stuff.
2012-02-17 03:53:52 +09:00
Barry Haddow
7091555cd6 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2012-02-08 17:48:19 +00:00
Barry Haddow
fa6753b0f3 Really simple sharding test 2012-02-08 17:47:54 +00:00
Barry Haddow
62d7d034bb Fix sharding bug 2012-02-08 17:11:56 +00:00
Tetsuo Kiso
905f959d83 Move functions defined in a header into .cpp file. 2012-02-01 21:44:37 +09:00
Tetsuo Kiso
b2987337d8 Remove virtual keyword from whoami() function.
The function is inherited from neither Scorer nor
StatisticsBasedScorer.
2012-02-01 21:36:25 +09:00