This is not pleasant to read (and much, much less pleasant to write!) but
sort of necessary in an open project. Right now it's quite hard to figure
out what is licensed how, which doesn't matter much to most people but can
suddenly become very important when people want to know what they are
allowed to do.
I kept the notices as short as I could. As far as I could see, everything
without a clear license notice is LGPL v2.1 or later.
This is lint reported by the new lint-checking functionality in beautify.py.
(We can change to a different lint checker if we have a better one, but it
would probably still flag these same problems.)
Lint checking can help a lot, but only if we get the lint under control.
Truecasing gives better results in a (small) test, and the code already had a placeholder for it.
(Without truecasing, the recaser is more likely to uppercase words like "the" if they are often sentence-initial in the training corpus.)
If people don't want the default behavior changed, I can disable truecasing by default and add a command-line parameter to enable it.
It is hard to identify the cause of a problem (or even to see that there is a problem) if a script continues after a
main step has failed. Better to exit when the error occurs, with relevant logs.
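
A minimal sketch of the exit-on-failure idea, in Python rather than the language of the actual script. The helper name run_step and the log format are hypothetical, chosen only for illustration; this is not the code from the commit.

```python
import subprocess
import sys


def run_step(name, command):
    """Run one pipeline step; return its stdout, or exit with logs on failure."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Say which step failed and show what it printed, then stop at once
        # so that later steps never run on missing or partial output.
        sys.stderr.write(f"ERROR: step '{name}' failed with exit code {result.returncode}\n")
        sys.stderr.write(result.stderr)
        sys.exit(result.returncode)
    return result.stdout
```

Each main step would go through run_step, so the first failure ends the run with the failing step's name and its stderr instead of letting the script carry on.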
By default, it still uses SRILM, so existing uses of this script by others won't break.
To switch to IRSTLM training, simply add the "-lm irstlm" command-line option.
If build-lm.sh is not accessible from $PATH, the option "-build-lm /path/to/build-lm.sh" is also available.
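
A rough sketch of how a backward-compatible default like this can be wired up, written in Python with argparse rather than in the script's own language. The function name parse_args, the accepted value spellings ("srilm", "irstlm"), and the fallback of looking up build-lm.sh on $PATH are assumptions made for illustration.

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser(description="Sketch of LM toolkit selection")
    # SRILM stays the default, so existing invocations behave as before.
    parser.add_argument("-lm", dest="lm", default="srilm",
                        type=str.lower, choices=["srilm", "irstlm"])
    # Only needed when build-lm.sh cannot be found on $PATH.
    parser.add_argument("-build-lm", dest="build_lm", default="build-lm.sh")
    return parser.parse_args(argv)
```

Calling parse_args([]) keeps the old SRILM behaviour, while parse_args(["-lm", "irstlm", "-build-lm", "/path/to/build-lm.sh"]) opts in to the new one.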
* training script now has proper default behaviour for single-factor models
* mert script has better handling of default lambda parameters that now
works with lexicalized reordering models, and also with multiple
model files (e.g. multiple language models)
* parallel mert script is more robust when single jobs fail: it detects
the failure and resubmits the crashed (or killed) jobs
* recaser added that builds on moses
* filtering script added that also binarizes filtered model files
(this will eventually be replaced when the lexicalized reordering
model also uses the binary format)
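
The detect-and-resubmit behaviour of the parallel mert script can be pictured roughly as below. This is a Python sketch under stated assumptions, not the script's code: run_with_resubmit, run_job, and the max_attempts limit are hypothetical names, and run_job is assumed to return True on success.

```python
def run_with_resubmit(jobs, run_job, max_attempts=3):
    """Run every job; rerun those that fail, up to max_attempts rounds.

    Returns the list of jobs that still failed after the last round.
    """
    pending = list(jobs)
    for _ in range(max_attempts):
        # Keep only the jobs that crashed or were killed in this round.
        failed = [job for job in pending if not run_job(job)]
        if not failed:
            return []
        pending = failed
    return pending
```

Only the failed jobs are submitted again, so work that already finished is not repeated.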
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1210 1f5c12ca-751b-0410-a591-d2e778427230