Commit Graph

55 Commits

Author SHA1 Message Date
Hieu Hoang
f35750bc08 beautify 2013-07-04 20:19:51 +01:00
Sara Stymne
b2eb42ed12 added document level Bleu scoring to mert 2013-07-03 14:03:58 +02:00
Hieu Hoang
6249432407 beautify 2013-05-29 18:16:15 +01:00
Barry Haddow
9ca364fb22 Implement brevity penalty smoothing for PRO
As in Nakov et al (Coling 2012)
2013-02-18 11:11:20 +00:00
Tetsuo Kiso
8fdec9bf30 Use boost::unordered_map instead of std::map.
For storing the word vocabulary used in computation of
BLEU scores. This change will reduce the running time
of the extractor by about 2-3 seconds (a 9% reduction).
2012-12-07 05:12:24 +09:00
Tetsuo Kiso
cccfb9a0c9 Using namespace std in a header file pollutes the global namespace.
Using directives should be put into the implementation files.
2012-11-05 00:43:36 +09:00
Barry Haddow
2b4e61d826 Merge branch 'trunk' into miramerge
Compiles, not tested.

Conflicts:
	Jamroot
	OnDiskPt/PhraseNode.h
	OnDiskPt/TargetPhrase.cpp
	OnDiskPt/TargetPhrase.h
	OnDiskPt/TargetPhraseCollection.cpp
	mert/BleuScorer.cpp
	mert/Data.cpp
	mert/FeatureData.cpp
	moses-chart-cmd/src/Main.cpp
	moses/src/AlignmentInfo.h
	moses/src/ChartManager.cpp
	moses/src/LM/Ken.cpp
	moses/src/LM/Ken.h
	moses/src/LMList.h
	moses/src/LexicalReordering.h
	moses/src/PhraseDictionaryTree.h
	moses/src/ScoreIndexManager.h
	moses/src/StaticData.h
	moses/src/TargetPhrase.h
	moses/src/Word.cpp
	scripts/ems/experiment.meta
	scripts/ems/experiment.perl
	scripts/training/train-model.perl
2012-07-17 13:36:50 +01:00
Hieu Hoang
e3dd3a8d2c namespace all classes in mert directory 2012-06-30 20:23:45 +01:00
Hieu Hoang
0cb63edcb9 merge Lexi Birch's LRScore from mert_mtm5 branch. Compiles and runs. Hack, must double check with Barry or Lexi 2012-06-23 22:51:48 -04:00
Eva Hasler
e1c1a5343c merge 2012-06-07 11:16:52 +01:00
Eva Hasler
6a6a35c65e fix start weights in experiment.perl, add hypothesis queue for picking hope and fear translations, add variations to 1slack formulation 2012-06-01 01:49:42 +01:00
Colin Cherry
fd577d7a65 Batch k-best MIRA is written and integrated into mert-moses.pl
Regression tests all check out, and kbmira seems to work fine
on a Hansard French->English task.

HypPackEnumerator class may be of interest to pro.cpp and future
optimizers, as it abstracts a lot of the boilerplate involved in
enumerating multiple k-best lists.

MiraWeightVector is not really mira-specific - just a weight vector
that enables efficient averaging. Could be useful to a perceptron
as well. Same goes for MiraFeatureVector.

Interaction with sparse features is written, but untested.
2012-05-29 13:38:57 -04:00
Eva Hasler
30deedde9f changed permission, everything changed.. 2012-05-10 18:54:24 +01:00
Tetsuo Kiso
9c9d88a78a Avoid "using namespace std" in headers. 2012-05-10 07:51:05 +09:00
Eva
6f39ad0b3e test 2012-04-28 23:11:30 -07:00
Tetsuo Kiso
d034eeb703 Add test cases for BLEU and sentence-level BLEU+1.
- Move a definition of sentenceLevelBleuPlusOne() from pro.cpp
  to BleuScorer.cpp.
- Add check for the length of an input vector.
2012-04-07 01:02:32 +09:00
Tetsuo Kiso
eaa0ab486a Add a test case for BLEU's clipped counts.
- Make BleuScorer::setReferenceFiles() more testable by
  adding OpenReference() and OpenReferenceStream().
2012-04-04 22:33:30 +09:00
Tetsuo Kiso
f686e8771a Add some functions to BleuScorer for unit testing.
This commit also includes
- Fix typo.
- Fix indentations.
- Add 'const' to Scorer::applyFactors().
2012-03-19 22:45:15 +09:00
Tetsuo Kiso
6b95a19eda Create Reference class to clean up BleuScorer.
- Add a unit test for Reference.
- Move functions to calculate the reference length from
  BleuScorer to Reference.
2012-03-18 05:58:40 +09:00
Tetsuo Kiso
fba01c7cdf Create a header file for NgramCounts class.
The reason is that we want to add the unit test.
2012-03-14 22:14:11 +09:00
Tetsuo Kiso
ed6e6f00b1 Minor change for calculating BLEU.
This avoids defining similar variables twice when calculating
document-wise and sentence-wise BLEU scores.
2012-03-10 02:49:31 +09:00
Tetsuo Kiso
669b9d9c7a Minor change to the logging utility for n-gram counts.
Use std::ostream instead of directly using std::cerr.
2012-02-26 02:01:03 +09:00
Tetsuo Kiso
8e0a61d0d7 Clean up calculation of effective reference length. 2012-02-26 01:54:51 +09:00
Tetsuo Kiso
17f06a3250 Hide the implementation details of Ngram counts from the header. 2012-02-26 01:11:56 +09:00
Tetsuo Kiso
0c9023abc6 Clean up commented out code snippets for debugging purposes. 2012-02-25 18:14:00 +09:00
Tetsuo Kiso
47ac8a474d Change the naming conventions for the guard macros; Rename TER directory.
This change might be useful to avoid duplicating the names.
The reason is that although MERT programs are standalone
applications, some header files such as data.h and
point.h have common guard macro names like "DATA_H" and
"POINT_H", and this is not a good naming convention
when you want to include external headers.
Some files actually include headers in Moses and KenLM's util.
2012-02-20 09:46:08 +09:00
Tetsuo Kiso
b19e7777ce Add prefix 'm_' to private and protected members in Scorer classes. 2012-02-01 20:54:20 +09:00
Tetsuo Kiso
30fa97e404 Move reference length type into a private member of BleuScorer.
The reason is that the type is used for internal purposes only.
2012-02-01 20:24:48 +09:00
Tetsuo Kiso
4d189eb14d Fix a typedef for comparing N-grams.
Declared const_iterator was not *const* actually.
2011-11-30 00:27:57 +09:00
Tetsuo Kiso
a639116847 Fix a typedef for comparing N-grams.
Declared const_iterator was not *const* actually.
2011-11-30 00:27:57 +09:00
Tetsuo Kiso
29c16d252a Minimize use of #include in headers.
Includes should go in .cpp files instead.
2011-11-14 15:15:30 +09:00
Tetsuo Kiso
54b3b846c7 Add const member functions in Scorer classes. 2011-11-12 10:58:14 +09:00
Tetsuo Kiso
00b8c6d768 Use const Scorer::calculateScore(). 2011-11-12 10:40:54 +09:00
Tetsuo Kiso
d776281b8b Simple refactoring of BLEU scorer. 2011-11-12 10:21:08 +09:00
Tetsuo Kiso
43beb88df5 Fix constructors of scorer classes and optimizer classes.
Using public const members is not a good idea.
They should be private and initialized by constructors.
2011-11-12 10:16:31 +09:00
Tetsuo Kiso
dfb714296f Add 'explicit' for constructors with one argument. 2011-11-12 09:51:27 +09:00
Tetsuo Kiso
ce9a628ed0 Remove unnecessary semicolons at the end of member functions. 2011-11-12 09:40:01 +09:00
Tetsuo Kiso
68315d6407 Fix class, function, and implementation comments format.
Function comments should be placed at their declarations.
2011-11-12 08:58:23 +09:00
Tetsuo Kiso
4f6d022fe7 Add comments to mark the end of #define guards. 2011-11-12 07:59:50 +09:00
Tetsuo Kiso
087756b8c3 Fix memory leaks in extractor. 2011-11-11 20:02:26 +09:00
servan
f223f5a276 M mert/TerScorer.cpp
M    mert/BleuScorer.h
M    mert/ScorerFactory.h
M    mert/Scorer.h
M    mert/PerScorer.h
M    mert/TerScorer.h
M    mert/Makefile.am
AM   scripts/training/mert-moses-multi.pl


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4299 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-05 13:36:17 +00:00
machacekmatous
642e8dce95 Added evaluator to MERT directory. This tool computes a metric score for given candidate and reference files:
evaluator --sctype PER --reference ref.file --candidate cand.file

usage: evaluator [options] --reference ref1[,ref2[,ref3...]] --candidate cand1[,cand2[,cand3...]]
[--sctype|-s] the scorer type (default BLEU)
[--scconfig|-c] configuration string passed to scorer
        This is of the form NAME1:VAL1,NAME2:VAL2 etc
[--reference|-R] comma separated list of reference files
[--candidate|-C] comma separated list of candidate files
[--bootstrap|-b] number of bootstrapped samples (default 0 - no bootstrapping)
[--rseed|-r] the random seed for bootstrapping (defaults to system clock)
[--help|-h] print this message and exit


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4153 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-20 15:25:19 +00:00
hieuhoang1972
148c1e8305 run beautify.perl. Consistent formatting for .h & .cpp files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3899 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-24 12:42:19 +00:00
nicolabertoldi
0393183eb4 mert software now works with different reference length policies: shortest, average, closest (default) and with case information (default is preserving case). Note that both defaults differ from the previous version (which used shortest reflen and was case-insensitive).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2459 1f5c12ca-751b-0410-a591-d2e778427230
2009-08-05 15:38:35 +00:00
phkoehn
1b5d99ad26 added headers for standard compliance (gcc 4.3 on 64 bit linux)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1905 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-16 21:14:38 +00:00
bhaddow
83f234cf17 Implementation of Cer et al mert regularisation. Use with argument such
as --scconfig regtype:min,regwin:3 in extractor and mert. Only tested
on toy example so far.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1860 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-24 19:27:18 +00:00
nicolabertoldi
e94834012d added facilities to read and write score statistics in binary format
moved facilities for feature names in FeatureData object


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1824 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-05 17:03:54 +00:00
nicolabertoldi
291260abf7 - made output more compliant with old version
- added PerScorer.h and BleuScorer.h
- stored feature names
- fixed bug about output of best Point


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1796 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-27 16:50:52 +00:00
bhaddow
c0643d47f2 Add scorer factory. Fix compile error in Optimizer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1706 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-15 16:03:49 +00:00
bhaddow
f320cf5174 Refactor PerScorer and BleuScorer to remove common code
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1704 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-15 14:48:11 +00:00