2045 Commits

Author SHA1 Message Date
hieuhoang1972
dd6c1e722e use gzipped extract file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3729 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-23 14:30:36 +00:00
hieuhoang1972
867a9bdf4b use gzipped extract file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3728 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-23 14:15:54 +00:00
xandfraser
c0c617a8c4 Added print word alignment in nbest
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3726 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-22 15:34:53 +00:00
bhaddow
6255216b6a Remove gnu-specific typeof
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3725 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-22 10:05:17 +00:00
hieuhoang1972
6f5d1e4732 deleting offending comment
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3724 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-21 16:35:31 +00:00
hieuhoang1972
4bc0a8e6b2 can set max num of lines for GT discount calc.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3723 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-19 20:11:10 +00:00
bhaddow
a7e0977eea Fix compile error by using correct macro.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3720 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-18 10:27:30 +00:00
chardmeier
837a667a95 Cleaned up language modelling code by disentangling the decoder's LM feature
function from the LM toolkit abstraction layer. There are two different groups
of classes now:
- LanguageModel, which inherits from StatefulFeatureFunction and contains
  the n-gram model feature function.
- LanguageModelImplementation, which is the base class of the individual
  LM implementations (SRI, IRST, RandLM, KenLM) and provides methods to
  query LM probabilities and states.
Each LanguageModel controls a LanguageModelImplementation. Implementations can
be shared by more than one LanguageModel.
This should make it easier to use the LM libraries as a backend for other
feature functions while retaining the flexibility to use different LM toolkits.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3719 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-17 14:06:21 +00:00
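The split described in commit 837a667a95 can be pictured with a minimal C++ sketch. Only the class names LanguageModel, LanguageModelImplementation and StatefulFeatureFunction come from the commit message; the method names (Evaluate, GetLogProb), the toy UniformLM backend, the weight member and the use of std::shared_ptr are assumptions made for illustration and are not taken from the Moses sources. State handling, which the message also mentions, is left out to keep the sketch short.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the decoder's stateful feature-function base.
class StatefulFeatureFunction {
public:
  virtual ~StatefulFeatureFunction() = default;
  // Illustrative signature: score a whole sentence.
  virtual float Evaluate(const std::vector<std::string>& words) const = 0;
};

// Toolkit abstraction layer: one subclass per LM backend
// (SRI, IRST, RandLM, KenLM in the commit message).
class LanguageModelImplementation {
public:
  virtual ~LanguageModelImplementation() = default;
  // Assumed query method: log-probability of `word` after `context`.
  virtual float GetLogProb(const std::vector<std::string>& context,
                           const std::string& word) const = 0;
};

// Toy backend used only for this sketch: every word gets the same score.
class UniformLM : public LanguageModelImplementation {
public:
  explicit UniformLM(float logProb) : m_logProb(logProb) {}
  float GetLogProb(const std::vector<std::string>&,
                   const std::string&) const override {
    return m_logProb;
  }
private:
  float m_logProb;
};

// Decoder-side feature function. It controls an implementation but does
// not own it exclusively, so several LanguageModel objects can share one.
class LanguageModel : public StatefulFeatureFunction {
public:
  LanguageModel(std::shared_ptr<LanguageModelImplementation> impl, float weight)
      : m_impl(std::move(impl)), m_weight(weight) {
    assert(m_impl && "a LanguageModel needs a backend implementation");
  }

  // Sum the backend's per-word log-probabilities and apply the weight.
  float Evaluate(const std::vector<std::string>& words) const override {
    float total = 0.0f;
    std::vector<std::string> context;
    for (const std::string& word : words) {
      total += m_impl->GetLogProb(context, word);
      context.push_back(word);
    }
    return m_weight * total;
  }

private:
  std::shared_ptr<LanguageModelImplementation> m_impl;
  float m_weight;
};
```

Constructing two LanguageModel objects from the same shared_ptr gives two feature functions with different weights that query a single loaded model, which is the sharing the message refers to.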
chardmeier
d18ff948f5 Bugfixes in srilm adaptor.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3718 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-17 13:23:44 +00:00
bojar
6616dd3f62 prettified usage string
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3714 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-16 00:26:50 +00:00
bojar
5c3a38bc2e fixed behaviour wrt weight-d: don't expect it unconditionally, as moses-chart does not use it
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3713 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-16 00:00:17 +00:00
hieuhoang1972
57e3a92836 rollback. argument not supported by all iconv
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3712 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-15 12:50:11 +00:00
leven101
84d83480b6 function name changes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3711 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-15 11:32:02 +00:00
leven101
5251a2823a separated source and target vocab in suffixarrays to support unequal factors
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3710 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-15 11:28:27 +00:00
hieuhoang1972
ff339e56e3 don't drop unknown char. replace it with improbable string. avoid misalignment
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3709 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-14 20:50:15 +00:00
hieuhoang1972
f7904a871c add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3704 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-12 14:43:52 +00:00
hieuhoang1972
687cf9bf29 add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3702 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-12 14:20:11 +00:00
hieuhoang1972
a79a6bbaec add scripts to exclude unparseable sentences
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3700 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-11 18:04:16 +00:00
hieuhoang1972
f1f04daa0a add empty line if input is empty line
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3699 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 12:11:55 +00:00
bojar
2ea140062b don't warn about probs outside [0,1] in -verbose 0
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3698 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 11:51:26 +00:00
bojar
ff56054a03 removed --inputweights, read this information from link-param-count instead
added negatable --starting-weights-from-ini (defaulting to yes)
improved documentation of --activate-features
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3697 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 11:25:40 +00:00
bojar
9838de2a81 handle also gzipped ini files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3696 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 11:21:28 +00:00
nicolabertoldi
d38b319405 workaround to force the use of the bash shell in the SGE
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3695 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 10:32:34 +00:00
heafield
82f29bfc16 Chris Dyer says this should make things compile better on OS X.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3694 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-10 02:05:51 +00:00
hieuhoang1972
3b6b002df8 xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3691 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-09 13:25:09 +00:00
bgottesman
518035ed05 add --possiblyUseFirstToken option, which, when selected, allows certain sentence-initial tokens to be taken into account. See comment in header or support mailing list discussion.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3690 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-09 11:05:23 +00:00
hieuhoang1972
9a72825d29 mac compile
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3689 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-08 16:09:04 +00:00
heafield
2784923899 Rename a bunch of kenlm files. A ./regenerate-makefiles.sh is required.
Make loading with MAP_POPULATE on Linux and read on other OSes the default.
Use LM #9 for lazy loading, as recommended by other devs.
Slightly faster trie.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3688 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-06 00:40:16 +00:00
leven101
34b45c0480 removed debug messages from BilingualDynSuffixArray.cpp
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3687 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-04 18:41:04 +00:00
bhaddow
3aee6fab5d Use correct conditional compilation flag for threaded moses
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3686 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-03 18:43:18 +00:00
heafield
bf88f87d78 Fix return value of FilePiece::ReadLine at end of file. Did not impact existing kenlm (since they don't read to the end of file) but will impact future versions.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3682 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-29 17:53:19 +00:00
heafield
c12c2c59d2 Autodetect model from binary format.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3675 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-28 01:05:04 +00:00
hieuhoang1972
eb374bf082 cygwin build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3674 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 20:47:28 +00:00
hieuhoang1972
735d5b682f xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3673 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 18:54:50 +00:00
heafield
614d6002a6 Integrate heafield-refactorlm. Faster kenlm with new binary format. Stateful language model framework.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3671 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 17:50:40 +00:00
nicolabertoldi
e1a5479928 modified the link to IRSTLM
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3669 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 13:18:07 +00:00
phkoehn
7334d49191 minor experiment.perl fixes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3668 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 12:42:34 +00:00
hieuhoang1972
46b59cbdd7 xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3667 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 10:20:33 +00:00
heafield
64cfacd1bd Backporting FilePiece leaked scoped_FILE, but only into the test.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3665 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 04:04:23 +00:00
hieuhoang1972
34e7c43114 xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3664 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-27 03:14:11 +00:00
nicolabertoldi
bb08dcb5b6 made code compliant with the enhanced IRSTLM library; IRSTLM release 5.50.01 is needed; backward compatibility is not assured
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3662 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 16:06:01 +00:00
pjwilliams
3ca16120a2 Add --MaxScope option to extract-rules (Hopkins and Langmead, 2010)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3661 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:55:57 +00:00
bojar
c0e0bc62c6 fixed a stupid bug from last commit
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3660 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:23:31 +00:00
bojar
878c7100de accept binarized ttables as well
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3659 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:22:01 +00:00
bojar
8cfc403fec default location of new mert
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3658 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 15:21:38 +00:00
heafield
d1b1b4f34c Tom from precision translation tools reports that IRST doesn't generate a blank line after each block. Removed this requirement from the parser.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3657 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-26 14:04:32 +00:00
phkoehn
c8ae94e426 training for global lexicon model
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3655 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-25 16:24:59 +00:00
phkoehn
2a594c0e2a fixed xml regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3646 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-23 21:27:34 +00:00
heafield
8d0d44f5cd Support gzipped ARPA files. Progress bar tweak. Test fixes. Holding off on the big change for now.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3643 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-23 05:21:10 +00:00
chardmeier
ecf4b0d368 Check for right boost version in memscore.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3640 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-22 14:36:43 +00:00