The code uses two mechanisms for generating random numbers: srand()/rand(), which is not thread-safe, and srandom()/random(), which is POSIX-specific. Here I add a util/random.cc module that centralizes these calls, and unifies some common usage patterns. If the implementation is not good enough, we can now change it in a single place. To keep things simple, this uses the portable srand()/rand() but protects them with a lock to avoid concurrency problems. The hard part was to keep the regression tests passing: they rely on fixed sequences of random numbers, so a small code change could break them very thoroughly. Util::rand(), for wide types like size_t, calls std::rand() not once but twice. This behaviour was generalized into utils::wide_rand() and friends.
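The locking and "call std::rand() twice" ideas described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the namespace `sketch` and the helpers `rand_lock`, `locked_rand` and `wide_rand_excl` are hypothetical names, and the way the two draws are combined is a guess, not necessarily what util/random.cc does.

```cpp
#include <cassert>
#include <cstdlib>
#include <mutex>

namespace sketch {  // hypothetical namespace, not the real util/random.cc

// One process-wide lock guarding every std::srand()/std::rand() call.
inline std::mutex &rand_lock() {
  static std::mutex m;
  return m;
}

// Seed the shared generator (assumed analogue of a central init call).
inline void rand_init(unsigned int seed) {
  std::lock_guard<std::mutex> guard(rand_lock());
  std::srand(seed);
}

// Draw one value in [0, RAND_MAX] while holding the lock.
inline int locked_rand() {
  std::lock_guard<std::mutex> guard(rand_lock());
  return std::rand();
}

// Wide draw: two std::rand() calls under a single lock acquisition,
// combined into one value in [0, (RAND_MAX + 1)^2 - 1].
// The combination rule is an assumption made for this sketch.
inline unsigned long long wide_rand() {
  std::lock_guard<std::mutex> guard(rand_lock());
  const unsigned long long range =
      static_cast<unsigned long long>(RAND_MAX) + 1ULL;
  const unsigned long long high =
      static_cast<unsigned long long>(std::rand());
  const unsigned long long low =
      static_cast<unsigned long long>(std::rand());
  return high * range + low;
}

// Wide value in [0, limit); plain modulo, so slightly biased.
inline unsigned long long wide_rand_excl(unsigned long long limit) {
  assert(limit > 0);
  return wide_rand() % limit;
}

}  // namespace sketch
```

Seeding with the same value and drawing in the same order reproduces the same sequence on a given platform, which is the property the regression tests depend on. Taking both draws under one lock acquisition keeps another thread from interleaving a call between them.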
22 lines
933 B
Plaintext
- check that mert-moses.pl emits devset score after every iteration
  - correctly for whichever metric we are optimizing
  - even when using --pairwise-ranked (PRO)
  - this may make use of 'evaluator', soon to be added by Matous Machacek

- check that --pairwise-ranked is compatible with all optimization metrics

- Use better random generators in util/random.cc, e.g. boost::mt19937.
  - Support plugging of custom random generators.

  Pros:
  - In MERT, you might want to use the random restarting technique to avoid
    local optima.
  - PRO uses a sampling technique to choose candidate translation pairs
    from N-best lists, which means the choice of random generators seems to
    be important.

  Cons:
  - This change will require us to re-create the truth results for regression
    testing related to MERT and PRO because the new random generator will
    generate different numbers than the current generator does.
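
A pluggable generator along the lines of the last TODO item might look like the sketch below. It is a minimal illustration, not a proposed design: `RandomSource`, `Mt19937Source` and `pick_index` are hypothetical names, and it uses std::mt19937 (the C++11 standard-library counterpart of the boost::mt19937 mentioned above) to stay self-contained.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>

namespace plugin_sketch {  // hypothetical namespace for illustration only

// Hypothetical plug-in interface: callers depend on this, not on one engine.
class RandomSource {
 public:
  virtual ~RandomSource() {}
  virtual void seed(std::uint32_t value) = 0;
  virtual std::uint32_t next() = 0;  // uniform over [0, 2^32 - 1]
};

// Mersenne Twister implementation of the interface.
class Mt19937Source : public RandomSource {
 public:
  // 5489 is the default seed the C++ standard specifies for std::mt19937.
  explicit Mt19937Source(std::uint32_t value = 5489u) : engine_(value) {}
  void seed(std::uint32_t value) override { engine_.seed(value); }
  std::uint32_t next() override {
    return static_cast<std::uint32_t>(engine_());
  }

 private:
  std::mt19937 engine_;
};

// Pick an index in [0, size), e.g. one candidate from an N-best list.
// Returns 0 for an empty range; plain modulo, so slightly biased.
inline std::size_t pick_index(RandomSource &source, std::size_t size) {
  return size == 0 ? 0 : source.next() % size;
}

}  // namespace plugin_sketch
```

Unlike std::rand(), the output sequence of std::mt19937 for a given seed is fixed by the C++ standard, so regenerated regression references would not depend on the platform's C library. The existing reference results would still have to be re-created once, as the Cons item notes.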