Throwing an exception on <s> and </s> is optional, but is the default.
If a binary file makes it to the ARPA parser (for example, somebody gzipped a binary file or passed one to build_binary), the error message is more informative.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3905 1f5c12ca-751b-0410-a591-d2e778427230
- Fix case where "foo bar baz" appears but "bar baz" does not. Previously probing silently returned the wrong answer and trie silently broke.
- More aggressive recombination: if "baz quux" is never followed by any word, then do not include "bar" in the state.
- kenlm assumes that "foo bar" is present if "foo bar baz" is. This is now checked.
- Binary format version number bump because the format has changed to support the above.
- Lower memory consumption during trie building. Building takes longer, though, in order to ensure correct handling of blanks and aggressive recombination.
- Fix progress bar newlines on trie building.
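
To illustrate the two n-gram conditions mentioned in the list above (a missing prefix such as "foo bar", and a missing suffix such as "bar baz"), here is a minimal Python sketch. It is not kenlm's implementation. The helper names split_ngrams, missing_prefixes and missing_suffixes are hypothetical, and the model is assumed to be a plain list of space-separated n-gram strings.

```python
def split_ngrams(lines):
    """Turn space-separated n-gram strings into a set of word tuples."""
    return {tuple(line.split()) for line in lines}


def missing_prefixes(ngrams):
    """N-grams whose prefix (all but the last word) is absent.

    Illustrates the assumption that "foo bar" is present if "foo bar baz" is.
    """
    return sorted(g for g in ngrams if len(g) > 1 and g[:-1] not in ngrams)


def missing_suffixes(ngrams):
    """N-grams whose suffix (all but the first word) is absent.

    Illustrates the case where "foo bar baz" appears but "bar baz" does not.
    """
    return sorted(g for g in ngrams if len(g) > 1 and g[1:] not in ngrams)


# Hypothetical toy model: "foo bar baz" is listed, "bar baz" is not.
model = split_ngrams(["foo", "bar", "baz", "foo bar", "foo bar baz"])
prefix_problems = missing_prefixes(model)
suffix_problems = missing_suffixes(model)
```

In this toy model every prefix is present, so only the suffix check reports "foo bar baz". How a real model handles such an entry is outside the scope of this sketch.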
Agrees with SRI's 1-best outputs on the WMT 10 evaluation set.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3847 1f5c12ca-751b-0410-a591-d2e778427230