Barry Haddow
2b4e61d826
Merge branch 'trunk' into miramerge
Compiles, not tested.
Conflicts:
Jamroot
OnDiskPt/PhraseNode.h
OnDiskPt/TargetPhrase.cpp
OnDiskPt/TargetPhrase.h
OnDiskPt/TargetPhraseCollection.cpp
mert/BleuScorer.cpp
mert/Data.cpp
mert/FeatureData.cpp
moses-chart-cmd/src/Main.cpp
moses/src/AlignmentInfo.h
moses/src/ChartManager.cpp
moses/src/LM/Ken.cpp
moses/src/LM/Ken.h
moses/src/LMList.h
moses/src/LexicalReordering.h
moses/src/PhraseDictionaryTree.h
moses/src/ScoreIndexManager.h
moses/src/StaticData.h
moses/src/TargetPhrase.h
moses/src/Word.cpp
scripts/ems/experiment.meta
scripts/ems/experiment.perl
scripts/training/train-model.perl
2012-07-17 13:36:50 +01:00
Hieu Hoang
e3dd3a8d2c
namespace all classes in mert directory
2012-06-30 20:23:45 +01:00
Hieu Hoang
0cb63edcb9
Merge Lexi Birch's LRScore from the mert_mtm5 branch. Compiles and runs. Hack; must double-check with Barry or Lexi.
2012-06-23 22:51:48 -04:00
Eva Hasler
e1c1a5343c
merge
2012-06-07 11:16:52 +01:00
Eva Hasler
6a6a35c65e
fix start weights in experiment.perl, add hypothesis queue for picking hope and fear translations, add variations to 1slack formulation
2012-06-01 01:49:42 +01:00
Colin Cherry
fd577d7a65
Batch k-best MIRA is written and integrated into mert-moses.pl
Regression tests all check out, and kbmira seems to work fine
on a Hansard French->English task.
HypPackEnumerator class may be of interest to pro.cpp and future
optimizers, as it abstracts a lot of the boilerplate involved in
enumerating multiple k-best lists.
MiraWeightVector is not really mira-specific - just a weight vector
that enables efficient averaging. Could be useful to a perceptron
as well. Same goes for MiraFeatureVector.
Interaction with sparse features is written, but untested.
2012-05-29 13:38:57 -04:00
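The "efficient averaging" mentioned for MiraWeightVector can be illustrated with a small sketch. This is not the code from the commit; it shows one common lazy-averaging trick, where a feature's running total is only brought up to date when that feature is touched. The class name `LazyAveragedWeights` and its methods are hypothetical.

```python
class LazyAveragedWeights:
    """Sparse weight vector whose average over all steps is computed lazily.

    Hypothetical illustration: each feature's running total is only caught
    up when the feature is updated or when the average is requested.
    """

    def __init__(self):
        self.weights = {}   # current weight per feature
        self.totals = {}    # sum of the weight over all completed steps
        self.last = {}      # step at which totals[k] was last caught up
        self.steps = 0      # number of completed steps

    def _catch_up(self, key):
        # Add the weight once for every step since the last catch-up.
        elapsed = self.steps - self.last.get(key, 0)
        self.totals[key] = self.totals.get(key, 0.0) + elapsed * self.weights.get(key, 0.0)
        self.last[key] = self.steps

    def update(self, delta):
        # Apply a sparse update; only the touched features cost any work.
        for key, change in delta.items():
            self._catch_up(key)
            self.weights[key] = self.weights.get(key, 0.0) + change

    def tick(self):
        # Mark the end of one step (the current weights count once more).
        self.steps += 1

    def average(self):
        if self.steps == 0:
            return dict(self.weights)
        for key in list(self.weights):
            self._catch_up(key)
        return {k: self.totals[k] / self.steps for k in self.weights}
```

With this layout a step costs time proportional to the number of updated features rather than the full feature set, which is why the same structure could serve a perceptron as well.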
Eva Hasler
30deedde9f
changed permission, everything changed..
2012-05-10 18:54:24 +01:00
Tetsuo Kiso
9c9d88a78a
Avoid "using namespace std" in headers.
2012-05-10 07:51:05 +09:00
Matous Machacek
440650bd6e
Added support for external unix filters to preprocess sentences in mert and evaluator
2012-05-09 19:21:41 +02:00
Eva
6c2a58a48e
clean up mira, add sampling from hope/model/fear
2012-04-29 21:29:18 -07:00
Eva
6f39ad0b3e
test
2012-04-28 23:11:30 -07:00
Tetsuo Kiso
d034eeb703
Add test cases for BLEU and sentence-level BLEU+1.
- Move a definition of sentenceLevelBleuPlusOne() from pro.cpp
to BleuScorer.cpp.
- Add check for the length of an input vector.
2012-04-07 01:02:32 +09:00
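For readers unfamiliar with sentence-level BLEU+1, the following is a minimal sketch of one common formulation, including a length check on the input vector like the one the commit above mentions. It is an illustration under stated assumptions, not the function from the commit: the statistics layout (matches and totals per n-gram order, followed by the reference length), the order of 4, and add-one smoothing on every order are all assumed here, and the function name is hypothetical.

```python
import math

NGRAM_ORDER = 4  # assumed maximum n-gram order


def sentence_bleu_plus_one(stats):
    """Smoothed sentence-level BLEU from sufficient statistics.

    Assumed layout: [match_1, total_1, ..., match_N, total_N, ref_len].
    """
    expected = 2 * NGRAM_ORDER + 1
    if len(stats) != expected:
        raise ValueError(f"expected {expected} statistics, got {len(stats)}")

    hyp_len, ref_len = stats[1], stats[-1]
    if hyp_len == 0:
        return 0.0  # an empty hypothesis scores zero

    # Add one to matches and totals so a missing n-gram order
    # does not drive the whole score to zero.
    log_bleu = 0.0
    for n in range(NGRAM_ORDER):
        matches, total = stats[2 * n], stats[2 * n + 1]
        log_bleu += math.log(matches + 1) - math.log(total + 1)
    log_bleu /= NGRAM_ORDER

    # Brevity penalty in log space, applied only to short hypotheses.
    brevity = 1.0 - ref_len / hyp_len
    if brevity < 0.0:
        log_bleu += brevity
    return math.exp(log_bleu)
```

Other variants smooth only the orders above unigrams; which one a given tool uses should be checked against its source.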
Tetsuo Kiso
eaa0ab486a
Add a test case for BLEU's clipped counts.
- Make BleuScorer::setReferenceFiles() more testable by
adding OpenReference() and OpenReferenceStream().
2012-04-04 22:33:30 +09:00
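As background for the clipped-counts test case, here is a short sketch of how BLEU clips n-gram matches: each hypothesis n-gram is credited at most as many times as it occurs in the reference that contains it most often. The helper names are hypothetical and the code is independent of BleuScorer.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_matches(hypothesis, references, n):
    """Number of hypothesis n-grams matched, clipped by the references."""
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngram_counts(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)

    hyp = ngram_counts(hypothesis, n)
    return sum(min(count, max_ref[gram]) for gram, count in hyp.items())
```

For example, a hypothesis made of seven repetitions of "the" scored against "the cat is on the mat" gets only two unigram matches, because "the" appears twice in that reference.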
Tetsuo Kiso
8987fed667
Add thread unsafe Singleton class.
- Add Vocabulary factory and the unit test.
- Remove Scorer::ClearVocabulary().
2012-03-20 05:49:10 +09:00
Tetsuo Kiso
525f06452c
Change the Encoder class to Vocabulary.
- Introduce the namespace to avoid naming collisions. The class name
is used in KenLM.
- Add the unit test.
2012-03-20 03:43:04 +09:00
Tetsuo Kiso
2b28072f7a
Move Encoder class from Scorer.h to Ngram.h.
To add unit tests.
2012-03-19 23:21:02 +09:00
Tetsuo Kiso
f686e8771a
Add some functions to BleuScorer for unit testing.
This commit also includes
- Fix typo.
- Fix indentations.
- Add 'const' to Scorer::applyFactors().
2012-03-19 22:45:15 +09:00
Tetsuo Kiso
6b95a19eda
Create Reference class to clean up BleuScorer.
- Add a unit test for Reference.
- Move functions to calculate the reference length from
BleuScorer to Reference.
2012-03-18 05:58:40 +09:00
Tetsuo Kiso
c6536a134b
Clean up BleuScorer.
2012-03-14 22:44:51 +09:00
Tetsuo Kiso
5007f129d8
Clean up BleuScorer with lookup().
2012-03-14 22:41:29 +09:00
Tetsuo Kiso
fba01c7cdf
Create a header file for NgramCounts class.
The reason is that we want to add the unit test.
2012-03-14 22:14:11 +09:00
Tetsuo Kiso
ed6e6f00b1
Minor change for calculating BLEU.
This avoids defining similar variables twice when calculating
document-wise and sentence-wise BLEU scores.
2012-03-10 02:49:31 +09:00
Matous Machacek
ba987c94ba
Support for using factors in mert and evaluator
example:
Use --factor "0|2" to use only first and third factor from nbest list and from reference.
If you use interpolated scorer, separate records with comma (e.g. --factor "0|2,1").
2012-02-28 02:27:23 +01:00
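The `--factor "0|2"` example above can be read as "keep factors 0 and 2 of every token". The sketch below illustrates that reading on a factored sentence. It is a hypothetical illustration, not the mert or evaluator code: the function names, the `|` separator for factored tokens, and the choice to re-join the kept factors with `|` are assumptions.

```python
def select_factors(sentence, factor_spec, sep="|"):
    """Keep only the factors listed in factor_spec (e.g. "0|2") for each token."""
    indices = [int(i) for i in factor_spec.split(sep)]
    kept = []
    for token in sentence.split():
        factors = token.split(sep)
        kept.append(sep.join(factors[i] for i in indices))
    return " ".join(kept)


def split_scorer_specs(argument):
    """Split a comma-separated --factor value into one spec per scorer."""
    return argument.split(",")
```

Under these assumptions, `select_factors("das|ART|the haus|NN|house", "0|2")` yields `das|the haus|house`, and `split_scorer_specs("0|2,1")` gives one spec for each scorer of an interpolated scorer.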
Tetsuo Kiso
c26e83fd09
Remove obsolete and unused logging statements.
2012-02-26 02:19:40 +09:00
Tetsuo Kiso
224c654fa5
Don't call the same functions repeatedly.
Consider storing the result in a constant where possible.
2012-02-26 02:12:59 +09:00
Tetsuo Kiso
669b9d9c7a
Minor change to the logging utility for n-gram counts.
Use std::ostream instead of directly using std::cerr.
2012-02-26 02:01:03 +09:00
Tetsuo Kiso
8e0a61d0d7
Clean up the calculation of the effective reference length.
2012-02-26 01:54:51 +09:00
Tetsuo Kiso
c4fa8a3865
Add a more efficient member to set up ScoreStats.
- Remove unnecessary conversions.
- Add 'const' to local variables.
2012-02-26 01:41:17 +09:00
Tetsuo Kiso
2c2bd63bbd
Replace string objects with const char[].
2012-02-26 01:18:08 +09:00
Tetsuo Kiso
17f06a3250
Hide the implementation details of Ngram counts from the header.
2012-02-26 01:11:56 +09:00
Tetsuo Kiso
0c9023abc6
Clean up commented out code snippets for debugging purposes.
2012-02-25 18:14:00 +09:00
Tetsuo Kiso
17e864e446
Create a private class to encapsulate the encoding process.
Instead of using typedefs inside a class only,
it might be better to create a private class to do the same things.
2012-02-01 21:19:25 +09:00
Tetsuo Kiso
b19e7777ce
Add prefix 'm_' to private and protected members in Scorer classes.
2012-02-01 20:54:20 +09:00
Tetsuo Kiso
30fa97e404
Move reference length type into a private member of BleuScorer.
The reason is that the type is only used internally.
2012-02-01 20:24:48 +09:00
Tetsuo Kiso
3ef03a77c4
Change casts to C++ style casts.
2012-02-01 18:13:00 +09:00
Tetsuo Kiso
142342f8be
Change casts to C++ style casts, and delete unnecessary casts.
2012-02-01 17:17:58 +09:00
Tetsuo Kiso
2fde1cab0e
Add missing headers.
2011-11-14 19:52:21 +09:00
Tetsuo Kiso
29c16d252a
Minimize #include directives in headers.
Use them in .cpp files instead.
2011-11-14 15:15:30 +09:00
Tetsuo Kiso
00b8c6d768
Use const Scorer::calculateScore().
2011-11-12 10:40:54 +09:00
Tetsuo Kiso
d776281b8b
Simple refactoring of BLEU scorer.
2011-11-12 10:21:08 +09:00
Tetsuo Kiso
43beb88df5
Fix constructors of scorer classes and optimizer classes.
Using public const members is not a good idea.
Members should be private and initialized by constructors.
2011-11-12 10:16:31 +09:00
Tetsuo Kiso
664ffe0130
Fix indentation.
2011-11-12 09:24:19 +09:00
Tetsuo Kiso
68315d6407
Fix class, function, and implementation comments format.
Function comments should be placed at their declarations.
2011-11-12 08:58:23 +09:00
Tetsuo Kiso
087756b8c3
Fix memory leaks in extractor.
2011-11-11 20:02:26 +09:00
hieuhoang1972
148c1e8305
run beautify.perl. Consistent formatting for .h & .cpp files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3899 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-24 12:42:19 +00:00
nicolabertoldi
0393183eb4
mert software now works with different reference length policies: shortest, average, and closest (default), and with case information (default is preserving case). Note that both defaults differ from the previous version, which used the shortest reference length and was case-insensitive.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2459 1f5c12ca-751b-0410-a591-d2e778427230
2009-08-05 15:38:35 +00:00
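The three reference length policies named above can be sketched as a single function. This is a minimal illustration rather than the mert implementation: the function name is hypothetical, and breaking ties in favour of the shorter reference under the "closest" policy is an assumption borrowed from common BLEU practice.

```python
def reference_length(hyp_len, ref_lens, policy="closest"):
    """Pick the reference length used for the brevity penalty."""
    if not ref_lens:
        raise ValueError("at least one reference length is required")
    if policy == "shortest":
        return min(ref_lens)
    if policy == "average":
        return sum(ref_lens) / len(ref_lens)
    if policy == "closest":
        # Smallest distance to the hypothesis length; ties go to the
        # shorter reference (assumed tie-breaking rule).
        return min(ref_lens, key=lambda r: (abs(r - hyp_len), r))
    raise ValueError(f"unknown reference length policy: {policy}")
```

For a hypothesis of length 10 and references of lengths 8, 12 and 15, "shortest" gives 8, "average" gives 35/3, and "closest" gives 8 because the tie between 8 and 12 is broken toward the shorter reference.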
bhaddow
83f234cf17
Implementation of Cer et al mert regularisation. Use with argument such
as --scconfig regtype:min,regwin:3 in extractor and mert. Only tested
on toy example so far.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1860 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-24 19:27:18 +00:00
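The `--scconfig regtype:min,regwin:3` argument above is a comma-separated list of `key:value` pairs. The sketch below parses such a string and applies a "min over a window" smoothing to a list of scores. It is a hypothetical illustration: the function names are invented, and treating `regwin` as the number of neighbours on each side of a point is an assumption, not a description of the committed code.

```python
def parse_scconfig(argument):
    """Parse "key:value,key:value" into a dict of strings."""
    config = {}
    for item in argument.split(","):
        if not item:
            continue
        key, sep, value = item.partition(":")
        if not sep:
            raise ValueError(f"malformed scconfig entry: {item!r}")
        config[key] = value
    return config


def smooth_min(scores, window):
    """Replace each score by the minimum over its neighbours.

    Assumes the window extends `window` positions to each side.
    """
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - window)
        hi = min(len(scores), i + window + 1)
        smoothed.append(min(scores[lo:hi]))
    return smoothed
```

A caller would read the window size with `int(parse_scconfig(arg)["regwin"])` and only smooth when `regtype` is `min`. Taking the minimum penalises points whose neighbours score badly, so narrow peaks are less likely to be selected.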
nicolabertoldi
e94834012d
added facilities to read and write score statistics in binary format
moved facilities for feature names in FeatureData object
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1824 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-05 17:03:54 +00:00
nicolabertoldi
291260abf7
- made output more compliant with old version
- added PerSCorer.h and BleuScorer.h
- stored feature names
- fixed bug about output of best Point
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1796 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-27 16:50:52 +00:00
nicolabertoldi
c9593648bb
change from int to unsigned where needed
add some debugging output (to remove later)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1794 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-23 11:48:16 +00:00