Commit Graph

100 Commits

Author SHA1 Message Date
Hieu Hoang
e0016712be Merge github.com:hieuhoang/mosesdecoder 2011-11-07 20:23:52 +07:00
Kenneth Heafield
0e9a172c1e Bugfix trie when 5-grams and 4-grams appear but 3-gram is blank 2011-11-03 19:51:54 +00:00
Hieu Hoang
fdd6c9795d xcode 2011-10-30 18:54:40 +07:00
Hieu Hoang
42924144fd Merge branch 'master' of github.com:moses-smt/mosesdecoder 2011-10-28 19:13:14 +07:00
Hieu Hoang
ae42f1aca0 Revert "xcode"
This reverts commit 743d5414e9.
2011-10-28 19:08:55 +07:00
Kenneth Heafield
c74d772bd9 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2011-10-24 17:55:18 +01:00
Kenneth Heafield
4b7a09dad6 Fix Trie building with pruned bigrams. There was a Bad File descriptor error. 2011-10-24 17:53:51 +01:00
Hieu Hoang
743d5414e9 xcode 2011-10-24 15:21:28 +07:00
Hieu Hoang
a93f4691f6 win32 2011-10-23 09:37:47 +07:00
Kenneth Heafield
8d1833a691 Corner case left state was not being fully minimized 2011-10-20 17:34:15 +01:00
Kenneth Heafield
ca773431c3 Reduce user e-mails related to IRSTLM formats 2011-10-19 17:52:37 +01:00
Kenneth Heafield
81dbd6574e Reduce header pollution 2011-10-19 14:32:20 +01:00
Kenneth Heafield
bcc036c587 Minor code cleanup 2011-10-19 11:00:57 +01:00
Hieu Hoang
01da665df5 xcode. kenlm fails for some reason 2011-10-18 19:31:08 +07:00
Hieu Hoang
f2814c65b9 xcode. kenlm fails for some reason 2011-10-18 19:28:50 +07:00
heafield
b13c341bc1 Remove some gcc-only variable sized arrays
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4382 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-17 09:42:36 +00:00
heafield
2bb2d6dc4a Reduce text phrase table loading time by 49.5%. Add a progress bar too. StringPiece is good for you.
This change introduces a dependency on Boost, which is now permitted in Moses.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4365 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-14 16:40:30 +00:00
heafield
967b725d73 Make Chris Dyer feel safer about compile time if statement handling
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4358 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-14 11:52:05 +00:00
heafield
7b129fa461 Add a test and a multi-token breaker
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4357 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-14 11:51:15 +00:00
heafield
15adb17e35 Move EnumerateVocab to namespace lm
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4335 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-12 10:18:23 +00:00
heafield
19f3f09a39 Updated left state minimization makes all states of length N-1 full
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4332 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-11 18:40:00 +00:00
heafield
86f1d3ec71 Fix trie for ARPAs from SRILM.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4331 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-11 18:27:36 +00:00
heafield
c9995dc44c Trie building bug fix
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4323 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-11 10:12:17 +00:00
heafield
71d0d389c5 Fix silly bug in merging
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4314 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-08 10:59:54 +00:00
hieuhoang1972
7538f5406a xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4278 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-28 00:30:18 +00:00
heafield
41cc547360 Fix a segfault with the chop variant of trie in chart decoding
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4272 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-26 20:54:41 +00:00
heafield
12e15abe8a Fix file naming bug
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4266 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-25 22:36:03 +00:00
hieuhoang1972
9762a65bd8 xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4264 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-25 10:44:47 +00:00
heafield
55e24da4d5 Sync with changes made for cdec. Documentation. Remove an unnecessary copy.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4262 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-24 15:24:33 +00:00
heafield
72bb0e51ac Add file used by a test.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4248 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-21 20:15:03 +00:00
heafield
3402bdfe7a Merge mtm_lm into trunk.
There's a fair number of files with no change that somebody must have touched in the branch so metadata is being recorded.
Updates kenlm binary file format, sorry.
It looks like OOV isn't being computed in EvaluateChart anyway, just phrasal.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4247 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-21 16:06:48 +00:00
hieuhoang1972
dca2b72a1c xcode macros for large file support
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4174 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-06 09:44:34 +00:00
heafield
6dae77c3eb Fix segfault with the trie on models without <unk>. Problem was that trie writes correct counts to the binary file header, including <unk>. But the vocabulary was sized based on the ARPA file count (excluding <unk>). Then when the binary file was loaded, the vocabulary size was based on the count including <unk>. Fix this by pre-padding vocabulary to the count including <unk>.
Also, some minor cleanups: remove a debug message and change some always-true returns to void.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4145 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-16 12:57:21 +00:00
heafield
61974ad75e Minor fixes. One for David Chiang who has files without initial newlines.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4108 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 19:46:19 +00:00
hieuhoang1972
fd08431e3b xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4076 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-15 12:09:33 +00:00
heafield
954dfd7d5e Optional compression for trie. Also, some better error handling.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4074 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-13 20:53:18 +00:00
hieuhoang1972
f7d534bcdd xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4046 1f5c12ca-751b-0410-a591-d2e778427230
2011-06-28 19:02:09 +00:00
heafield
025ab3f7f0 Sorry I used a GCC-only dynamically sized array
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4041 1f5c12ca-751b-0410-a591-d2e778427230
2011-06-27 21:28:22 +00:00
heafield
3616cf09fb Fix accidental format change
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4040 1f5c12ca-751b-0410-a591-d2e778427230
2011-06-27 21:20:42 +00:00
heafield
5e70e3bd40 Quantization.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4037 1f5c12ca-751b-0410-a591-d2e778427230
2011-06-26 22:21:44 +00:00
heafield
e0d618528a Speed improvements mostly. Trie went from 803448 queries/s to 990520 queries/s by knowing what the bounds are in advance. Also, set read ahead for files.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3988 1f5c12ca-751b-0410-a591-d2e778427230
2011-05-23 02:23:01 +00:00
heafield
606d3c72b3 Denis Filimonov points out that NFS really likes an msync before munmap.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3986 1f5c12ca-751b-0410-a591-d2e778427230
2011-05-20 19:34:52 +00:00
heafield
1e05ab182e Allow broken IRST ARPAs to still build but be passive-aggressive about it.
Slight update to SRI wrapper that nobody uses anyway.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3980 1f5c12ca-751b-0410-a591-d2e778427230
2011-05-17 16:43:05 +00:00
oliver-wilson
c45e757c1f Put back missing typedef that broke the build.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3952 1f5c12ca-751b-0410-a591-d2e778427230
2011-04-20 08:11:04 +00:00
heafield
3274f72bfb More documentation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3951 1f5c12ca-751b-0410-a591-d2e778427230
2011-04-19 15:17:01 +00:00
heafield
a3385c3905 Chris Dyer wanted #include <cstddef> for newer g++
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3945 1f5c12ca-751b-0410-a591-d2e778427230
2011-04-06 21:14:24 +00:00
heafield
a7584f57d5 Don't malloc non-PODs without calling their constructor. Also, exception safety.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3942 1f5c12ca-751b-0410-a591-d2e778427230
2011-04-02 01:57:22 +00:00
hieuhoang1972
0fdde952bc compiles with clang++
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3940 1f5c12ca-751b-0410-a591-d2e778427230
2011-04-01 23:31:11 +00:00
hieuhoang1972
adc2ac2c6a xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3938 1f5c12ca-751b-0410-a591-d2e778427230
2011-03-30 20:31:09 +00:00
heafield
02c767a16f Hieu's opinion is to keep the standalone test shell scripts.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3937 1f5c12ca-751b-0410-a591-d2e778427230
2011-03-29 15:07:57 +00:00