Commit Graph

60 Commits

Author SHA1 Message Date
Barry Haddow
2b4e61d826 Merge branch 'trunk' into miramerge
Compiles, not tested.

Conflicts:
	Jamroot
	OnDiskPt/PhraseNode.h
	OnDiskPt/TargetPhrase.cpp
	OnDiskPt/TargetPhrase.h
	OnDiskPt/TargetPhraseCollection.cpp
	mert/BleuScorer.cpp
	mert/Data.cpp
	mert/FeatureData.cpp
	moses-chart-cmd/src/Main.cpp
	moses/src/AlignmentInfo.h
	moses/src/ChartManager.cpp
	moses/src/LM/Ken.cpp
	moses/src/LM/Ken.h
	moses/src/LMList.h
	moses/src/LexicalReordering.h
	moses/src/PhraseDictionaryTree.h
	moses/src/ScoreIndexManager.h
	moses/src/StaticData.h
	moses/src/TargetPhrase.h
	moses/src/Word.cpp
	scripts/ems/experiment.meta
	scripts/ems/experiment.perl
	scripts/training/train-model.perl
2012-07-17 13:36:50 +01:00
Hieu Hoang
e3dd3a8d2c namespace all classes in mert directory 2012-06-30 20:23:45 +01:00
Hieu Hoang
0cb63edcb9 merge Lexi Birch's LRScore from mert_mtm5 branch. Compiles and run. Hack, must double check with barry or lexi 2012-06-23 22:51:48 -04:00
Eva Hasler
e1c1a5343c merge 2012-06-07 11:16:52 +01:00
Eva Hasler
6a6a35c65e fix start weights in experiment.perl, add hypothesis queue for picking hope and fear translations, add variations to 1slack formulation 2012-06-01 01:49:42 +01:00
Colin Cherry
fd577d7a65 Batch k-best MIRA is written and integrated into mert-moses.pl
Regression tests all check out, and kbmira seems to work fine
on a Hansard French->English task.

HypPackEnumerator class may be of interest to pro.cpp and future
optimizers, as it abstracts a lot of the boilerplate involved in
enumerating multiple k-best lists.

MiraWeightVector is not really mira-specific - just a weight vector
that enables efficient averaging. Could be useful to a perceptron
as well. Same goes for MiraFeatureVector.

Interaction with sparse features is written, but untested.
2012-05-29 13:38:57 -04:00
Eva Hasler
30deedde9f changed permission, everything changed.. 2012-05-10 18:54:24 +01:00
Tetsuo Kiso
9c9d88a78a Avoid "using namespace std" in headers. 2012-05-10 07:51:05 +09:00
Matous Machacek
440650bd6e Added support for external unix filters to preprocess sentences in mert and evaluator 2012-05-09 19:21:41 +02:00
Eva
6c2a58a48e clean up mira, add sampling from hope/model/fear 2012-04-29 21:29:18 -07:00
Eva
6f39ad0b3e test 2012-04-28 23:11:30 -07:00
Tetsuo Kiso
d034eeb703 Add test cases for BLEU and sentence-level BLEU+1.
- Move a definition of sentenceLevelBleuPlusOne() from pro.cpp
  to BleuScorer.cpp.
- Add check for the length of an input vector.
2012-04-07 01:02:32 +09:00
Tetsuo Kiso
eaa0ab486a Add a test case for BLEU's clipped counts.
- Make BleuScorer::setReferenceFiles() more testable by
  adding OpenReference() and OpenReferenceStream().
2012-04-04 22:33:30 +09:00
Tetsuo Kiso
8987fed667 Add thread unsafe Singleton class.
- Add Vocabulary factory and the unit test.
- Remove Scorer::ClearVocabulary().
2012-03-20 05:49:10 +09:00
Tetsuo Kiso
525f06452c Change the Encoder class to Vocabulary.
- Introduce the namespace to avoid naming collisions. The class name
  is used in KenLM.
- Add the unit test.
2012-03-20 03:43:04 +09:00
Tetsuo Kiso
2b28072f7a Move Encoder class from Scorer.h to Ngram.h.
To add unit tests.
2012-03-19 23:21:02 +09:00
Tetsuo Kiso
f686e8771a Add some functions to BleuScorer for unit testing.
This commit also includes
- Fix typo.
- Fix indentations.
- Add 'const' to Scorer::applyFactors().
2012-03-19 22:45:15 +09:00
Tetsuo Kiso
6b95a19eda Create Reference class to clean up BleuScorer.
- Add an unit test for Reference.
- Move functions to calculate the reference length from
  BleuScorer to Reference.
2012-03-18 05:58:40 +09:00
Tetsuo Kiso
c6536a134b Clean up BleuScorer. 2012-03-14 22:44:51 +09:00
Tetsuo Kiso
5007f129d8 Clean up BleuScorer with lookup(). 2012-03-14 22:41:29 +09:00
Tetsuo Kiso
fba01c7cdf Create a header file for NgramCounts class.
The reason is that we want to add the unit test.
2012-03-14 22:14:11 +09:00
Tetsuo Kiso
ed6e6f00b1 Minor change for calculating BLEU.
To avoid defining the similar variables twice to calculate
document-wise BLEU and sentence-wise BLEU scores.
2012-03-10 02:49:31 +09:00
Matous Machacek
ba987c94ba Support for using factors in mert and evaluator
example:
Use --factor "0|2" to use only first and third factor from nbest list and from reference.
If you use interpolated scorer, separate records with comma (e.g. --factor "0|2,1").
2012-02-28 02:27:23 +01:00
Tetsuo Kiso
c26e83fd09 Remove obsolete and unused logging statements. 2012-02-26 02:19:40 +09:00
Tetsuo Kiso
224c654fa5 Don't repeat calling functions many times.
Consider using constants the result if it is possible.
2012-02-26 02:12:59 +09:00
Tetsuo Kiso
669b9d9c7a Minor change the logging utility for n-gram counts.
Use std::ostream instead of directly using std::cerr.
2012-02-26 02:01:03 +09:00
Tetsuo Kiso
8e0a61d0d7 Clean up calculation effective reference length. 2012-02-26 01:54:51 +09:00
Tetsuo Kiso
c4fa8a3865 Add a more efficient member to set up ScoreStats.
- Remove unnecessary conversions.

- Add 'const' to local variables.
2012-02-26 01:41:17 +09:00
Tetsuo Kiso
2c2bd63bbd Replace string objects with const char[]. 2012-02-26 01:18:08 +09:00
Tetsuo Kiso
17f06a3250 Hide the implementation details of Ngram counts from the header. 2012-02-26 01:11:56 +09:00
Tetsuo Kiso
0c9023abc6 Clean up commented out code snippets for debugging purposes. 2012-02-25 18:14:00 +09:00
Tetsuo Kiso
17e864e446 Create private class to encapssulate encoding process.
Instead of using typedefs inside a class only,
it might be better to create a private class to do same things.
2012-02-01 21:19:25 +09:00
Tetsuo Kiso
b19e7777ce Add prefix 'm_' to private and protected members in Scorer classes. 2012-02-01 20:54:20 +09:00
Tetsuo Kiso
30fa97e404 Move reference length type into a private member of BleuScorer.
The reason is that the type is used as internal purpose.
2012-02-01 20:24:48 +09:00
Tetsuo Kiso
3ef03a77c4 Change casts to C++ style casts. 2012-02-01 18:13:00 +09:00
Tetsuo Kiso
142342f8be Change casts to C++ style casts, and delete unnecessary casts. 2012-02-01 17:17:58 +09:00
Tetsuo Kiso
2fde1cab0e Add missing headers. 2011-11-14 19:52:21 +09:00
Tetsuo Kiso
29c16d252a Minimize using #include headers in headers.
Should use it in .cpp files.
2011-11-14 15:15:30 +09:00
Tetsuo Kiso
00b8c6d768 Use const Scorer::calculateScore(). 2011-11-12 10:40:54 +09:00
Tetsuo Kiso
d776281b8b Simple refactoring of BLEU scorer. 2011-11-12 10:21:08 +09:00
Tetsuo Kiso
43beb88df5 Fix constructors of scorer classes and optimizer classes.
Using public const members is not good idea.
It should be initialized in private by constructors.
2011-11-12 10:16:31 +09:00
Tetsuo Kiso
664ffe0130 Fix indentation. 2011-11-12 09:24:19 +09:00
Tetsuo Kiso
68315d6407 Fix class, function, and implementation comments format.
Functions comments should be placed in their declarations.
2011-11-12 08:58:23 +09:00
Tetsuo Kiso
087756b8c3 Fix memory leaks in extractor. 2011-11-11 20:02:26 +09:00
hieuhoang1972
148c1e8305 run beautify.perl. Consistent formatting for .h & .cpp files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3899 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-24 12:42:19 +00:00
nicolabertoldi
0393183eb4 mert software now works with different reference length policies: shortest, average, closest (default) and with case information (default is preserving case). Pay attention that both defaults are different from the previous version (which were shortest reflen and case-insensitive).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2459 1f5c12ca-751b-0410-a591-d2e778427230
2009-08-05 15:38:35 +00:00
bhaddow
83f234cf17 Implementation of Cer et al mert regularisation. Use with argument such
as --scconfig regtype:min,regwin:3 in extractor and mert. Only tested
on toy example so far.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1860 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-24 19:27:18 +00:00
nicolabertoldi
e94834012d added facilities to read and write score statistics in binary format
moved facilities for feature names in FeatureData object


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1824 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-05 17:03:54 +00:00
nicolabertoldi
291260abf7 - made output more compliant with old version
- added PerSCorer.h and BleuScorer.h
- stored feature names
- fixed bug about output of best Point


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1796 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-27 16:50:52 +00:00
nicolabertoldi
c9593648bb change from int to unsigned where needed
add some debugging output (to remove later)

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1794 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-23 11:48:16 +00:00