Matthias Huck
4ee8f2dec1
sentence-bleu less greedy regarding memory
...
Don't load all references, read them line by line.
Corpora with millions of sentences can now be evaluated without consuming gigabytes of RAM.
2015-04-30 22:26:30 +01:00
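The line-by-line idea in the commit above can be sketched as follows. This is a minimal illustration, not the code from sentence-bleu: `forEachSentencePair` and its callback are hypothetical names, and the sketch assumes one hypothesis and one reference per line.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of sentence-bleu): walks two streams in
// lockstep, so only the current hypothesis/reference pair is held in memory.
template <typename Callback>
std::size_t forEachSentencePair(std::istream& hypotheses,
                                std::istream& references,
                                Callback onPair) {
  std::string hyp;
  std::string ref;
  std::size_t count = 0;
  // Stop as soon as either stream runs out of lines.
  while (std::getline(hypotheses, hyp) && std::getline(references, ref)) {
    onPair(hyp, ref);
    ++count;
  }
  return count;
}
```

Memory use then depends on the longest line rather than on the size of the corpus, because nothing is accumulated between iterations.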
Matthias Huck
34d1d3a904
sentence-bleu-nbest
2015-04-30 19:44:29 +01:00
Rico Sennrich
3d00e5dc8c
basic support for more metrics with kbmira
...
metrics need getReferenceLength (for background smoothing) to work with kbmira
2014-09-22 10:49:20 +01:00
Hieu Hoang
862e1ad4ae
more gcc compile errors
2013-11-14 19:15:53 +00:00
Hieu Hoang
6249432407
beautify
2013-05-29 18:16:15 +01:00
Tetsuo Kiso
8fdec9bf30
Use boost::unordered_map instead of std::map.
...
For storing the word vocabulary used in the computation of
BLEU scores. This change reduces the running time
of extractor by about 2-3 seconds (a 9% reduction).
2012-12-07 05:12:24 +09:00
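A hash-based vocabulary of the kind described above might look like the sketch below. It uses `std::unordered_map` as a stand-in for `boost::unordered_map` so that it compiles without Boost, and `ToyVocabulary` is a hypothetical name rather than the class in mert.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the mert vocabulary: maps each distinct token
// to a small integer id, assigning ids in order of first appearance.
class ToyVocabulary {
 public:
  int encode(const std::string& token) {
    auto it = ids_.find(token);
    if (it != ids_.end()) return it->second;  // token already known
    const int id = static_cast<int>(ids_.size());
    ids_.emplace(token, id);
    return id;
  }

  std::size_t size() const { return ids_.size(); }

 private:
  // Average O(1) lookup, versus O(log n) for the ordered std::map.
  std::unordered_map<std::string, int> ids_;
};
```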
Hieu Hoang
7d664b745e
Integrate Lexi's LR Score into tuning
2012-07-10 09:25:00 +01:00
Hieu Hoang
e3dd3a8d2c
namespace all classes in mert directory
2012-06-30 20:23:45 +01:00
Hieu Hoang
00f018a477
Merge https://github.com/moses-smt/mosesdecoder into lrscore
2012-06-25 16:57:17 -04:00
Hieu Hoang
8498b17a41
gcc version-specific error
2012-06-25 14:45:45 +01:00
Hieu Hoang
0cb63edcb9
merge Lexi Birch's LRScore from mert_mtm5 branch. Compiles and runs. Hack; must double-check with Barry or Lexi
2012-06-23 22:51:48 -04:00
Hieu Hoang
7d19fe13ae
merge Lexi Birch's LRScore from mert_mtm5 branch
2012-06-22 18:19:16 +01:00
Tetsuo Kiso
9c9d88a78a
Avoid "using namespace std" in headers.
2012-05-10 07:51:05 +09:00
Tetsuo Kiso
5f7967402a
Reduce compilation dependencies.
2012-05-10 07:16:38 +09:00
Tetsuo Kiso
afa356aec4
Small changes to improve code quality.
...
- Use forward declaration to reduce dependencies.
- Add "virtual" to the destructor of _fdstream class.
- Avoid using namespace std in header.
- It is already used a lot in mert, though; that should be fixed.
- Fix warnings "-Wreorder".
- Fix the usage of enum.
2012-05-10 06:57:44 +09:00
Matous Machacek
440650bd6e
Added support for external unix filters to preprocess sentences in mert and evaluator
2012-05-09 19:21:41 +02:00
Tetsuo Kiso
8987fed667
Add thread-unsafe Singleton class.
...
- Add Vocabulary factory and the unit test.
- Remove Scorer::ClearVocabulary().
2012-03-20 05:49:10 +09:00
Tetsuo Kiso
525f06452c
Change the Encoder class to Vocabulary.
...
- Introduce the namespace to avoid naming collisions. The class name
is used in KenLM.
- Add the unit test.
2012-03-20 03:43:04 +09:00
Tetsuo Kiso
2b28072f7a
Move Encoder class from Scorer.h to Ngram.h.
...
To add unit tests.
2012-03-19 23:21:02 +09:00
Tetsuo Kiso
f686e8771a
Add some functions to BleuScorer for unit testing.
...
This commit also includes
- Fix typo.
- Fix indentations.
- Add 'const' to Scorer::applyFactors().
2012-03-19 22:45:15 +09:00
Matous Machacek
ba987c94ba
Support for using factors in mert and evaluator
...
Example:
Use --factor "0|2" to use only the first and third factors from the nbest list and from the reference.
If you use the interpolated scorer, separate the records with commas (e.g. --factor "0|2,1").
2012-02-28 02:27:23 +01:00
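The effect of a factor specification such as "0|2" on a single token can be illustrated with the sketch below. The function names are hypothetical, and two behaviours are assumptions rather than facts from the commit: the selected factors are joined with '|' again, and indices beyond the token's factor count are skipped silently.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits text on a single-character delimiter.
std::vector<std::string> splitOn(const std::string& text, char delimiter) {
  std::vector<std::string> parts;
  std::istringstream in(text);
  std::string part;
  while (std::getline(in, part, delimiter)) parts.push_back(part);
  return parts;
}

// Hypothetical helper: keeps only the factors named in factorSpec
// (e.g. "0|2") from a factored token such as "house|NN|haus".
std::string keepFactors(const std::string& token,
                        const std::string& factorSpec) {
  const std::vector<std::string> factors = splitOn(token, '|');
  std::string result;
  bool first = true;
  for (const std::string& indexText : splitOn(factorSpec, '|')) {
    const std::size_t index = std::stoul(indexText);
    if (index >= factors.size()) continue;  // assumption: skip missing factors
    if (!first) result += '|';              // assumption: re-join with '|'
    result += factors[index];
    first = false;
  }
  return result;
}
```

With this sketch, `keepFactors("house|NN|haus", "0|2")` yields `house|haus`.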
Matous Machacek
e8a94a7bd2
Added interpolated scorer
...
example: to interpolate BLEU and CDER use --sctype=BLEU,CDER
to specify weights use --scconfig=weights:0.3+0.7
This scorer should soon replace MergeScorer (which requires mert-moses-multi.pl).
The interpolated scorer is more general and is used in the same way as the other scorers.
2012-02-26 18:53:08 +01:00
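As a rough sketch of what interpolation with weights:0.3+0.7 means, the code below parses the '+'-separated weights and forms a weighted sum of per-metric scores. `parseWeights` and `interpolate` are hypothetical names. Whether the real scorer normalizes the weights, or how it treats error metrics such as CDER, is not stated in the commit and is not modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turns "0.3+0.7" into {0.3, 0.7}.
std::vector<double> parseWeights(const std::string& spec) {
  std::vector<double> weights;
  std::istringstream in(spec);
  std::string item;
  while (std::getline(in, item, '+')) weights.push_back(std::stod(item));
  return weights;
}

// Weighted sum of the individual metric scores, one weight per metric.
double interpolate(const std::vector<double>& scores,
                   const std::vector<double>& weights) {
  assert(scores.size() == weights.size());
  double total = 0.0;
  for (std::size_t i = 0; i < scores.size(); ++i) {
    total += weights[i] * scores[i];
  }
  return total;
}
```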
Tetsuo Kiso
47ac8a474d
Change the naming conventions for the guard macros; Rename TER directory.
...
This change might be useful to avoid duplicating the names.
The reason is that although MERT programs are standalone
applications, some header files such as data.h and
point.h have common guard macro names like "DATA_H" and
"POINT_H", and this is not a good naming convention
when you want to include external headers.
Some files actually include headers in Moses and KenLM's util.
2012-02-20 09:46:08 +09:00
Tetsuo Kiso
5cd5b90d0d
Create an initialize function.
2012-02-01 21:26:47 +09:00
Tetsuo Kiso
17e864e446
Create private class to encapsulate the encoding process.
...
Instead of using typedefs inside a class only,
it might be better to create a private class to do the same thing.
2012-02-01 21:19:25 +09:00
Tetsuo Kiso
a351a74c18
Move regularization type into StatisticsBasedScorer.
...
The type is used only internally.
2012-02-01 20:58:49 +09:00
Tetsuo Kiso
b19e7777ce
Add prefix 'm_' to private and protected members in Scorer classes.
2012-02-01 20:54:20 +09:00
Tetsuo Kiso
3ef03a77c4
Change casts to C++ style casts.
2012-02-01 18:13:00 +09:00
Tetsuo Kiso
29c16d252a
Minimize #include directives in headers.
...
They should go in .cpp files instead.
2011-11-14 15:15:30 +09:00
Tetsuo Kiso
54b3b846c7
Add const member functions in Scorer classes.
2011-11-12 10:58:14 +09:00
Tetsuo Kiso
00b8c6d768
Use const Scorer::calculateScore().
2011-11-12 10:40:54 +09:00
Tetsuo Kiso
43beb88df5
Fix constructors of scorer classes and optimizer classes.
...
Using public const members is not a good idea.
They should be private and initialized by constructors.
2011-11-12 10:16:31 +09:00
Tetsuo Kiso
ce9a628ed0
Remove unnecessary semicolons at the end of member functions.
2011-11-12 09:40:01 +09:00
Tetsuo Kiso
664ffe0130
Fix indentation.
2011-11-12 09:24:19 +09:00
Tetsuo Kiso
68315d6407
Fix class, function, and implementation comments format.
...
Function comments should be placed at their declarations.
2011-11-12 08:58:23 +09:00
Tetsuo Kiso
4f6d022fe7
Add comments to mark the end of #define guards.
2011-11-12 07:59:50 +09:00
servan
f223f5a276
M mert/TerScorer.cpp
...
M mert/BleuScorer.h
M mert/ScorerFactory.h
M mert/Scorer.h
M mert/PerScorer.h
M mert/TerScorer.h
M mert/Makefile.am
AM scripts/training/mert-moses-multi.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4299 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-05 13:36:17 +00:00
hieuhoang1972
148c1e8305
run beautify.perl. Consistent formatting for .h & .cpp files
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3899 1f5c12ca-751b-0410-a591-d2e778427230
2011-02-24 12:42:19 +00:00
nicolabertoldi
0393183eb4
mert software now works with different reference length policies: shortest, average, closest (default) and with case information (default is preserving case). Note that both defaults differ from the previous version (which used the shortest reference length and was case-insensitive).
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2459 1f5c12ca-751b-0410-a591-d2e778427230
2009-08-05 15:38:35 +00:00
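The three reference length policies named above can be sketched as follows. This is an illustration under stated assumptions, not the mert implementation: `RefLenPolicy` and `effectiveReferenceLength` are hypothetical names, and resolving ties under the closest policy in favour of the shorter reference is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <numeric>
#include <vector>

// Hypothetical names; the actual mert code may be organised differently.
enum class RefLenPolicy { Shortest, Average, Closest };

// Picks the reference length used for one sentence, given the lengths of
// all its references and the length of the hypothesis.
double effectiveReferenceLength(const std::vector<int>& refLengths,
                                int hypLength, RefLenPolicy policy) {
  assert(!refLengths.empty());
  switch (policy) {
    case RefLenPolicy::Shortest:
      return *std::min_element(refLengths.begin(), refLengths.end());
    case RefLenPolicy::Average:
      return std::accumulate(refLengths.begin(), refLengths.end(), 0.0) /
             static_cast<double>(refLengths.size());
    case RefLenPolicy::Closest: {
      int best = refLengths.front();
      for (int len : refLengths) {
        const int diff = std::abs(len - hypLength);
        const int bestDiff = std::abs(best - hypLength);
        // Assumption: on a tie, prefer the shorter reference.
        if (diff < bestDiff || (diff == bestDiff && len < best)) best = len;
      }
      return best;
    }
  }
  return 0.0;  // unreachable for valid policies
}
```

For reference lengths 8, 12 and 15 and a hypothesis of length 11, shortest gives 8, average gives about 11.67, and closest gives 12.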
bhaddow
83f234cf17
Implementation of Cer et al. MERT regularisation.
...
Use with an argument such as --scconfig regtype:min,regwin:3 in extractor and mert.
Only tested on a toy example so far.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1860 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-24 19:27:18 +00:00
nicolabertoldi
e94834012d
added facilities to read and write score statistics in binary format
...
moved facilities for feature names in FeatureData object
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1824 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-05 17:03:54 +00:00
nicolabertoldi
af585bc492
nbest can be read from stdin, too
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1797 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-27 17:20:01 +00:00
nicolabertoldi
291260abf7
- made output more compliant with old version
...
- added PerScorer.h and BleuScorer.h
- stored feature names
- fixed bug about output of best Point
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1796 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-27 16:50:52 +00:00
nicolabertoldi
c9593648bb
change from int to unsigned where needed
...
add some debugging output (to remove later)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1794 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-23 11:48:16 +00:00
nicolabertoldi
8cf59edcdc
remove loadnbest from FeatureData and Scoredata; change test_scorer accordingly;
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1787 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-20 14:33:47 +00:00
jfouet
b231ffc8b1
add Types.h to unify the typedefs
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1713 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-15 19:09:01 +00:00
bhaddow
933400a503
extractor uses scorer factory
...
remove feature_extractor
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1710 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-15 17:14:11 +00:00
bhaddow
c0643d47f2
Add scorer factory. Fix compile error in Optimizer
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1706 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-15 16:03:49 +00:00
bhaddow
f320cf5174
Refactor PerScorer and BleuScorer to remove common code
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1704 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-15 14:48:11 +00:00
bhaddow
f98de25e70
Stub out PER scorer. Some refactoring to make this possible
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1681 1f5c12ca-751b-0410-a591-d2e778427230
2008-05-14 20:36:11 +00:00