Commit Graph

8 Commits

Author SHA1 Message Date
heafield
3274f72bfb More documentation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3951 1f5c12ca-751b-0410-a591-d2e778427230
2011-04-19 15:17:01 +00:00
heafield
a3385c3905 Chris Dyer wanted #include <cstddef> for newer g++
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3945 1f5c12ca-751b-0410-a591-d2e778427230
2011-04-06 21:14:24 +00:00
heafield
fa04e673bf Minor fixes: unused parameter, factor optional components into a central header.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3849 1f5c12ca-751b-0410-a591-d2e778427230
2011-01-26 01:19:11 +00:00
heafield
22ce1d2f19 kenlm update
- Fix case where "foo bar baz" appears but "bar baz" does not.  Previously probing silently returned the wrong answer and trie silently broke.  
- More aggressive recombination: if "baz quux" is never followed by any word, then do not include "bar" in the state.  
- kenlm assumes that "foo bar" is present if "foo bar baz" is.  This is now checked.  
- Binary format version number bump because the format has changed to support the above.  
- Lower memory consumption during trie building.  It takes longer, though, in order to ensure correct handling of blanks and aggressive recombination.  
- Fix progress bar newlines on trie building.

Agrees with SRI's 1-best outputs on the WMT 10 evaluation set.  



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3847 1f5c12ca-751b-0410-a591-d2e778427230
2011-01-25 19:11:48 +00:00
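
The prefix assumption checked in the commit above ("foo bar" must be present if "foo bar baz" is) can be illustrated with a minimal sketch. This is not kenlm's code: the function name `missing_prefixes` and the plain-list input are hypothetical, chosen only to show the idea of the check.

```python
def missing_prefixes(ngrams):
    """Return the proper prefixes that are absent from a set of n-grams.

    Hypothetical helper for illustration; each n-gram is a
    whitespace-separated string such as "foo bar baz".
    """
    present = {tuple(g.split()) for g in ngrams}
    missing = set()
    for gram in present:
        # Every proper prefix of length 1 .. len(gram) - 1 should also be present.
        for k in range(1, len(gram)):
            prefix = gram[:k]
            if prefix not in present:
                missing.add(prefix)
    return sorted(missing)


if __name__ == "__main__":
    consistent = ["foo", "foo bar", "foo bar baz"]
    broken = ["foo", "foo bar baz"]  # "foo bar" is absent
    print(missing_prefixes(consistent))
    print(missing_prefixes(broken))
```

The same loop could be pointed at suffixes (`gram[k:]`) to flag the other case the commit mentions, where "foo bar baz" appears but "bar baz" does not.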
heafield
a596b48971 Fix --enable-shared compilation.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3796 1f5c12ca-751b-0410-a591-d2e778427230
2011-01-11 19:32:59 +00:00
heafield
614d6002a6 Integrate heafield-refactorlm. Faster kenlm with new binary format. Stateful language model framework.  



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3671 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 17:50:40 +00:00
heafield
d00c788760 kenlm update
mmap works; utility to build binary format included.  
Configuration struct (including unknown handling options). 
config option to build a binary format while loading an ARPA.  
Doesn't require Boost or ICU. 
Works on 32 and 64 bit. 
query appends </s>. 
Reduced memory consumption: 12 bytes per 5-gram instead of 16 bytes on 64-bit machines.  
Reduced memory consumption: vocabulary takes 8 bytes/word instead of 12 bytes/word if sorted is used. 
Removed some cruft that wasn't needed by this code.  
Compiles on Mac OS X.  
Add script to run tests; these depend on Boost.  
SRI wrapper works again, is slightly faster, no longer depends on Boost, and has a test.
Debugging code only appears with -DDEBUG, so the default is fast.  



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3447 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-14 21:33:11 +00:00
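
As a rough worked example of the per-entry savings quoted in the commit above (16 to 12 bytes per 5-gram, 12 to 8 bytes per vocabulary word), the arithmetic can be sketched as follows. The entry counts are hypothetical and only illustrate scale; they are not taken from any real model.

```python
def savings_bytes(num_entries, old_bytes, new_bytes):
    """Bytes saved when each entry shrinks from old_bytes to new_bytes."""
    return num_entries * (old_bytes - new_bytes)


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    num_five_grams = 100_000_000
    vocab_size = 1_000_000

    five_gram_saved = savings_bytes(num_five_grams, 16, 12)
    vocab_saved = savings_bytes(vocab_size, 12, 8)

    print(five_gram_saved)  # 400000000
    print(vocab_saved)      # 4000000
```

Under these assumed counts the 5-gram change dominates, since the number of 5-grams is typically far larger than the vocabulary.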
hieuhoang1972
473e0e3e96 Ken's LM
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3421 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-10 00:36:07 +00:00