84 Commits

Author SHA1 Message Date
Jeroen Vermeulen
38d790cac0 Add cross-platform randomizer module.
The code uses two mechanisms for generating random numbers: srand()/rand(),
which is not thread-safe, and srandom()/random(), which is POSIX-specific.

Here I add a util/random.cc module that centralizes these calls, and unifies
some common usage patterns.  If the implementation is not good enough, we can
now change it in a single place.

To keep things simple, this uses the portable srand()/rand() but protects them
with a lock to avoid concurrency problems.

The hard part was to keep the regression tests passing: they rely on fixed
sequences of random numbers, so a small code change could break them very
thoroughly.  Util::rand(), for wide types like size_t, calls std::rand() not
once but twice.  This behaviour was generalized into utils::wide_rand() and
friends.
2015-04-23 23:46:04 +07:00
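
The approach described in this commit message (one lock around srand()/rand(), plus a "wide" draw that calls rand() twice) can be sketched roughly as follows. This is an illustration only, not the contents of util/random.cc: the namespace `sketch` and the names `rand_lock`, `rand_init`, `rand_int` and `rand_excl` are hypothetical, `wide_rand` borrows only its name from the message, and `std::mutex` is assumed here in place of the Boost locking mentioned in the related commit below.

```cpp
#include <cassert>
#include <cstdlib>
#include <mutex>

namespace sketch {  // hypothetical namespace, not the project's own

// One process-wide lock guarding the C library generator, whose
// srand()/rand() pair is not guaranteed to be thread-safe.
inline std::mutex &rand_lock() {
  static std::mutex lock;
  return lock;
}

// Seed the generator while holding the lock.
inline void rand_init(unsigned int seed) {
  std::lock_guard<std::mutex> guard(rand_lock());
  std::srand(seed);
}

// Draw one value in [0, RAND_MAX] while holding the lock.
inline int rand_int() {
  std::lock_guard<std::mutex> guard(rand_lock());
  return std::rand();
}

// Draw a value in [0, exclusive_max) by taking a remainder.
// The remainder introduces a slight bias; acceptable for a sketch.
inline int rand_excl(int exclusive_max) {
  assert(exclusive_max > 0);
  return rand_int() % exclusive_max;
}

// Combine two consecutive draws into one wider value. Both draws happen
// under a single lock acquisition so another thread cannot interleave.
inline unsigned long long wide_rand() {
  std::lock_guard<std::mutex> guard(rand_lock());
  const unsigned long long high = static_cast<unsigned long long>(std::rand());
  const unsigned long long low = static_cast<unsigned long long>(std::rand());
  return high * (static_cast<unsigned long long>(RAND_MAX) + 1ULL) + low;
}

}  // namespace sketch
```

Because every call goes through the same functions, reseeding with the same value reproduces the same sequence. Regression tests that rely on fixed sequences depend on that property, and on the number of rand() calls per draw staying unchanged.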
Jeroen Vermeulen
75bfb75882 Thread-safe, platform-agnostic randomizer.
Some places in mert use srandom()/random(), but these are POSIX-specific.
The standard alternative, srand()/rand(), is not thread-safe.  This module
wraps srand()/rand() in mutexes (very short-lived, so should not cost much)
so that it relies on just Boost and the C standard library, not on a Unix-like
environment.

This may reduce the width of the random numbers on some platforms: it goes
from "long int" to just "int".  If that is a problem, we may have to use
Boost's randomizer utilities, or eventually, the C++ ones.
2015-04-22 20:43:29 +07:00
Nicola Bertoldi
e4eb201c52 merged master into dynamic-models and solved conflicts 2014-12-13 12:52:47 +01:00
Rico Sennrich
d39cbca0b9 (optionally) use n-best file for evaluator/return-best-dev
this adds support for metrics that rely on alignment / trees
2014-09-22 10:49:20 +01:00
Rico Sennrich
f40bb2c53c HWCM for MERT 2014-09-22 10:49:20 +01:00
Nicola Bertoldi
e452a13062 beautify 2014-01-15 16:49:57 +01:00
Hieu Hoang
d9be81596e replace CHECK with UTIL_THROW_IF in mert 2013-11-18 18:13:10 +00:00
Hieu Hoang
6249432407 beautify 2013-05-29 18:16:15 +01:00
phikoehn
4cdffc8a89 fixes for sparse feature handling 2013-05-17 08:37:29 +01:00
hieu
05045d574c don't display unknown weight penalty when showing weight, don't usually tune. Also, change delimiter in mert extractor from : to = 2012-12-16 18:29:53 +00:00
Tetsuo Kiso
c7f6e38326 Use FilePiece to load N-best lists.
Since FilePiece is friendly with StringPiece.
2012-12-07 02:39:02 +09:00
Tetsuo Kiso
38e145e556 Use util::TokenIter to tokenize n-best lists.
Reduce creating std::string objects, too. In both ScoreArray
and FeatureArray classes, the private members to track sentence
indices (namely, "m_index") were unnecessarily declared as
std::string, but it's better to directly declare them as 'int'.
2012-12-07 01:39:22 +09:00
Tetsuo Kiso
cd3fb3b831 Untabify. 2012-12-06 23:46:22 +09:00
Barry Haddow
0a950ee9f4 Merge remote branch 'github/master' into miramerge
Compiles, but not tested. Had to disable relent filter. Strangely, it seems to contain the
whole of moses-cmd.

Conflicts:
	Jamroot
	OnDiskPt/TargetPhrase.cpp
	moses-cmd/src/Main.cpp
	moses/src/AlignmentInfo.cpp
	moses/src/AlignmentInfo.h
	moses/src/ChartTranslationOptionCollection.cpp
	moses/src/ChartTranslationOptionCollection.h
	moses/src/GenerationDictionary.cpp
	moses/src/Jamfile
	moses/src/Parameter.cpp
	moses/src/PhraseDictionary.cpp
	moses/src/StaticData.cpp
	moses/src/StaticData.h
	moses/src/TargetPhrase.h
	moses/src/TranslationSystem.cpp
	moses/src/TranslationSystem.h
	moses/src/Word.cpp
	phrase-extract/score.cpp
	regression-testing/Jamfile
	scripts/ems/experiment.meta
	scripts/ems/experiment.perl
	scripts/training/train-model.perl
2012-09-26 22:49:33 +01:00
Arianna Bisazza
ff276e9911 Fixed several bugs in LRscore-MERT. Namely, solved a float-to-int conversion; added hypothesis counter to the scores file to enable later computation of average reordering score; fixed special case of 1-word hypothesis; enabled reading of word-based alignments from n-best-list. 2012-09-24 15:40:18 +02:00
Barry Haddow
2b4e61d826 Merge branch 'trunk' into miramerge
Compiles, not tested.

Conflicts:
	Jamroot
	OnDiskPt/PhraseNode.h
	OnDiskPt/TargetPhrase.cpp
	OnDiskPt/TargetPhrase.h
	OnDiskPt/TargetPhraseCollection.cpp
	mert/BleuScorer.cpp
	mert/Data.cpp
	mert/FeatureData.cpp
	moses-chart-cmd/src/Main.cpp
	moses/src/AlignmentInfo.h
	moses/src/ChartManager.cpp
	moses/src/LM/Ken.cpp
	moses/src/LM/Ken.h
	moses/src/LMList.h
	moses/src/LexicalReordering.h
	moses/src/PhraseDictionaryTree.h
	moses/src/ScoreIndexManager.h
	moses/src/StaticData.h
	moses/src/TargetPhrase.h
	moses/src/Word.cpp
	scripts/ems/experiment.meta
	scripts/ems/experiment.perl
	scripts/training/train-model.perl
2012-07-17 13:36:50 +01:00
Hieu Hoang
7d664b745e Integrate Lexi's LR Score into tuning 2012-07-10 09:25:00 +01:00
Hieu Hoang
e3dd3a8d2c namespace all classes in mert directory 2012-06-30 20:23:45 +01:00
Barry Haddow
c397d2068b Merge branch 'trunk' into miramerge. Still to fix build.
Conflicts:
	Jamroot
	mert/Data.cpp
	mert/Data.h
	mert/FeatureArray.cpp
	mert/FeatureArray.h
	mert/FeatureData.cpp
	mert/FeatureData.h
	mert/FeatureStats.cpp
	mert/FeatureStats.h
	mert/mert.cpp
	moses-chart-cmd/src/IOWrapper.h
	moses-chart-cmd/src/Main.cpp
	moses-cmd/src/IOWrapper.cpp
	moses-cmd/src/IOWrapper.h
	moses-cmd/src/Main.cpp
	moses/src/GlobalLexicalModel.cpp
	moses/src/Jamfile
	moses/src/Parameter.cpp
	moses/src/PhraseDictionary.cpp
	moses/src/ScoreIndexManager.h
	moses/src/TargetPhrase.h
	regression-testing/tests/phrase.lexicalized-reordering-bin/truth/results.txt
	regression-testing/tests/phrase.lexicalized-reordering-cn/truth/results.txt
	regression-testing/tests/phrase.lexicalized-reordering/truth/results.txt
	regression-testing/tests/phrase.multiple-translation-system-lr/truth/results.txt
	regression-testing/tests/phrase.show-weights.lex-reorder/truth/results.txt
	regression-testing/tests/phrase.show-weights/truth/results.txt
	scripts/ems/experiment.meta
	scripts/ems/experiment.perl
	scripts/training/filter-model-given-input.pl
	scripts/training/mert-moses.pl
2012-05-24 21:11:35 +01:00
Tetsuo Kiso
dbfe766f2c Fix using directive refers to implicitly-defined namespace 'std'. 2012-05-06 05:27:04 +09:00
Tetsuo Kiso
df4586740d Fix using directive refers to implicitly-defined namespace 'std'. 2012-05-06 05:27:04 +09:00
Tetsuo Kiso
fe79b96328 Use std::stringstream instead of using snprintf() for Windows.
This commit fixes compilation problems related to
snprintf() for Windows users.

Thanks to Raka Prasetya for reporting the errors.
Thanks also to Kenneth Heafield and Barry Haddow for suggestions.
2012-04-18 23:47:48 +09:00
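
As a rough illustration of the kind of change this commit describes, a number can be formatted through std::stringstream instead of snprintf(), which older Windows toolchains did not provide under that name. The helper name `IndexToString` is hypothetical and does not come from the mert sources.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: format an integer without snprintf().
// The stream grows as needed, so no fixed-size buffer is involved.
inline std::string IndexToString(int index) {
  std::stringstream out;
  out << index;
  return out.str();
}
```

The stream-based version trades some speed for portability and removes the need to pick a buffer size.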
Tetsuo Kiso
bd79fc2c13 Use std::stringstream instead of using snprintf() for Windows.
This commit fixes compilation problems related to
snprintf() for Windows users.

Thanks to Raka Prasetya for reporting the errors.
Thanks also to Kenneth Heafield and Barry Haddow for suggestions.
2012-04-18 23:47:48 +09:00
Tetsuo Kiso
e2a92c0f91 Use EndsWith(). 2012-04-05 00:03:13 +09:00
Tetsuo Kiso
8a2495c966 Use EndsWith(). 2012-04-05 00:03:13 +09:00
Tetsuo Kiso
27515f5de1 Add a function to check whether a string ends with a suffix.
- Use the function in Data::InitFeatureMap().
- Add a unit test for InitFeatureMap().
- Move helper functions for Data::loadnbest() to public for unit testing.
2012-04-04 22:04:51 +09:00
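
A suffix check of the kind this commit adds might look like the sketch below. The commit names the helper EndsWith() but does not show its signature, so the parameter types here are an assumption.

```cpp
#include <string>

// Assumed signature: returns true if `str` ends with `suffix`.
// An empty suffix matches every string.
inline bool EndsWith(const std::string &str, const std::string &suffix) {
  return str.size() >= suffix.size() &&
         str.compare(str.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

Checking the sizes first keeps the subtraction from underflowing when the suffix is longer than the string.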
Tetsuo Kiso
1ade69a546 Add a function to check whether a string ends with a suffix.
- Use the function in Data::InitFeatureMap().
- Add a unit test for InitFeatureMap().
- Move helper functions for Data::loadnbest() to public for unit testing.
2012-04-04 22:04:51 +09:00
Tetsuo Kiso
3ce46da4cd Clean up Data; add TODOs. 2012-03-10 17:47:01 +09:00
Tetsuo Kiso
81309bdb2d Clean up Data; add TODOs. 2012-03-10 17:47:01 +09:00
Tetsuo Kiso
b5bcf48b17 Pass by pointers to Scorer instead of references. 2012-03-10 17:28:38 +09:00
Tetsuo Kiso
a1ab79c7fc Pass by pointers to Scorer instead of references. 2012-03-10 17:28:38 +09:00
Tetsuo Kiso
e7a2483b22 mert: Prefix private members with "m_" except TER.
Squashed commit of the following:

- Clean up PRO.
- Clean up ScoreStats.
- Clean up ScoreData.
- Clean up ScoreArray.
- Remove unnecessary headers.
- Clean up ScopedVector.
- Clean up Point.
- Clean up PerScorer.
- Clean up Optimizer.
- Clean up MergeScorer.
- Clean up InterpolatedScorer.
- Clean up FileStream.
- Clean up FeatureStats.
- Remove inefficient string concatenation.
- Clean up FeatureData.
- Clean up FeatureArray.
- Clean up Data.
2012-03-10 17:12:34 +09:00
Tetsuo Kiso
eb2c9ee5e3 mert: Prefix private members with "m_" except TER.
Squashed commit of the following:

- Clean up PRO.
- Clean up ScoreStats.
- Clean up ScoreData.
- Clean up ScoreArray.
- Remove unnecessary headers.
- Clean up ScopedVector.
- Clean up Point.
- Clean up PerScorer.
- Clean up Optimizer.
- Clean up MergeScorer.
- Clean up InterpolatedScorer.
- Clean up FileStream.
- Clean up FeatureStats.
- Remove inefficient string concatenation.
- Clean up FeatureData.
- Clean up FeatureArray.
- Clean up Data.
2012-03-10 17:12:34 +09:00
Tetsuo Kiso
127f958bed Remove an unused variable and unnecessary 'std::'. 2012-03-07 07:19:24 +09:00
Tetsuo Kiso
851a1835b6 Remove an unused variable and unnecessary 'std::'. 2012-03-07 07:19:24 +09:00
Tetsuo Kiso
07d42f7614 Remove an unused variable. 2012-03-07 07:07:29 +09:00
Tetsuo Kiso
6ada41576c Remove an unused variable. 2012-03-07 07:07:29 +09:00
Tetsuo Kiso
6b1dfa3434 Clean up Data::loadnbest().
Add helper functions.
2012-03-07 07:01:28 +09:00
Tetsuo Kiso
2bdeee9caa Clean up Data::loadnbest().
Add helper functions.
2012-03-07 07:01:28 +09:00
Tetsuo Kiso
82da44b030 Fix typo. 2012-02-20 08:29:53 +09:00
Tetsuo Kiso
94888b258d Fix typo. 2012-02-20 08:29:53 +09:00
Barry Haddow
62d7d034bb Fix sharding bug 2012-02-08 17:11:56 +00:00
Barry Haddow
752724594e Fix sharding bug 2012-02-08 17:11:56 +00:00
Tetsuo Kiso
142342f8be Change casts to C++ style casts, and delete unnecessary casts. 2012-02-01 17:17:58 +09:00
Tetsuo Kiso
194e24115a Change casts to C++ style casts, and delete unnecessary casts. 2012-02-01 17:17:58 +09:00
Barry Haddow
ced24a881d Implementation of feature-merging for pro-mert 2012-01-13 16:52:15 +00:00
Hieu Hoang
575168c277 uint -> size_t 2011-12-12 23:27:27 +07:00
Hieu Hoang
ca0a3ea870 uint -> size_t 2011-12-12 23:27:27 +07:00
Hieu Hoang
753eebd959 revert 2011-12-12 20:48:42 +07:00
Hieu Hoang
21009b5d1e revert 2011-12-12 20:48:42 +07:00