mosesdecoder

mirror of https://github.com/moses-smt/mosesdecoder.git synced 2024-12-29 15:04:05 +03:00

Author	SHA1	Message	Date
heafield	6dae77c3eb	Fix segfault with the trie on models without <unk>. Problem was that trie writes correct counts to the binary file header, including <unk>. But the vocabulary was sized based on the ARPA file count (excluding <unk>). Then when the binary file was loaded, the vocabulary size was based on the count including <unk>. Fix this by pre-padding vocabulary to the count including <unk>. Also, some minor cleanups: remove a debug message and change some always-true returns to void. git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4145 1f5c12ca-751b-0410-a591-d2e778427230	2011-08-16 12:57:21 +00:00
heafield	954dfd7d5e	Optional compression for trie. Also, some better error handling. git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4074 1f5c12ca-751b-0410-a591-d2e778427230	2011-07-13 20:53:18 +00:00
heafield	5e70e3bd40	Quantization. git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4037 1f5c12ca-751b-0410-a591-d2e778427230	2011-06-26 22:21:44 +00:00
heafield	22ce1d2f19	kenlm update - Fix case where "foo bar baz" appears but "bar baz" does not. Previously probing silently returned the wrong answer and trie silently broke. - More aggressive recombination: if "baz quux" is never followed by any word, then do not include "bar" in the state. - kenlm assumes that "foo bar" is present if "foo bar baz" is. This is now checked. - Binary format version number bump because the format has changed to support the above. - Lower memory consumption for trie building, but building will take longer in order to ensure correct handling of blanks and aggressive recombination. - Fix progress bar newlines on trie building. Agrees with SRI's 1-best outputs on the WMT 10 evaluation set. git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3847 1f5c12ca-751b-0410-a591-d2e778427230	2011-01-25 19:11:48 +00:00
heafield	2784923899	Rename a bunch of kenlm files. A ./regenerate-makefiles.sh is required. Make loading with MAP_POPULATE on Linux and read on other OSes the default. Use LM #9 for lazy loading, as recommended by other devs. Slightly faster trie. git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3688 1f5c12ca-751b-0410-a591-d2e778427230	2010-11-06 00:40:16 +00:00
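
The vocabulary-size mismatch described in commit 6dae77c3eb can be illustrated with a minimal sketch. This is not kenlm's code: the names `UNK`, `padded_vocab_size`, and `build_vocab` are hypothetical, and the sketch only assumes, as the commit message states, that the binary header counts <unk> even when the ARPA file does not list it.

```python
UNK = "<unk>"  # hypothetical constant for the unknown-word token


def padded_vocab_size(arpa_words):
    """Slots to reserve for the vocabulary, always counting <unk>.

    Sizing from the ARPA unigram list alone would be one slot short
    whenever the file omits <unk>, which is the mismatch described
    in the commit message.
    """
    unique = set(arpa_words)
    return len(unique) + (0 if UNK in unique else 1)


def build_vocab(arpa_words):
    """Map each word to an index, reserving index 0 for <unk>."""
    vocab = {UNK: 0}
    for word in arpa_words:
        # setdefault leaves an existing entry (including <unk>) untouched
        vocab.setdefault(word, len(vocab))
    return vocab


words_without_unk = ["the", "cat", "sat"]
vocab = build_vocab(words_without_unk)
# The reserved size and the built vocabulary agree, with or without <unk>.
assert len(vocab) == padded_vocab_size(words_without_unk)
```

With sizing done this way, the size used when writing and the size used when loading both include <unk>, so they cannot disagree.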

5 Commits