The code uses two mechanisms for generating random numbers: srand()/rand(),
which is not thread-safe, and srandom()/random(), which is POSIX-specific.
Here I add a util/random.cc module that centralizes these calls, and unifies
some common usage patterns. If the implementation is not good enough, we can
now change it in a single place.
To keep things simple, this uses the portable srand()/rand() but protects them
with a lock to avoid concurrency problems.
The hard part was to keep the regression tests passing: they rely on fixed
sequences of random numbers, so a small code change could break them very
thoroughly. Util::rand(), for wide types like size_t, calls std::rand() not
once but twice. This behaviour was generalized into utils::wide_rand() and
friends.
Regression tests all check out, and kbmira seems to work fine
on a Hansard French->English task.
HypPackEnumerator class may be of interest to pro.cpp and future
optimizers, as it abstracts a lot of the boilerplate involved in
enumerating multiple k-best lists.
MiraWeightVector is not really mira-specific - just a weight vector
that enables efficient averaging. Could be useful to a perceptron
as well. Same goes for MiraFeatureVector.
Interaction with sparse features is written, but untested.