Commit Graph

45 Commits

Author SHA1 Message Date
bhaddow
a2730c445d Merge up to 3791 from trunk.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/mira-mtm5@3792 1f5c12ca-751b-0410-a591-d2e778427230
2011-01-05 13:49:44 +00:00
chardmeier
837a667a95 Cleaned up language modelling code by disentangling the decoder's LM feature
function from the LM toolkit abstraction layer. There are two different groups
of classes now:
- LanguageModel, which inherits from StatefulFeatureFunction and contains
  the n-gram model feature function.
- LanguageModelImplementation, which is the base class of the individual
  LM implementations (SRI, IRST, RandLM, KenLM) and provides methods to
  query LM probabilities and states.
Each LanguageModel controls a LanguageModelImplementation. Implementations can
be shared by more than one LanguageModel.
This should make it easier to use the LM libraries as a backend for other
feature functions while retaining the flexibility to use different LM toolkits.
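The split described above can be sketched roughly as follows. This is a minimal illustration, not the actual Moses code: the method names and signatures (`Evaluate`, `GetLogProb`), the toy `UniformLM` backend and the `weight` parameter are all assumptions made for the example, and state handling is left out entirely.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the decoder's feature-function base class.
class StatefulFeatureFunction {
public:
    virtual ~StatefulFeatureFunction() = default;
    virtual float Evaluate(const std::vector<std::string>& words) const = 0;
};

// Toolkit abstraction layer: one subclass per LM backend (SRI, IRST, ...).
class LanguageModelImplementation {
public:
    virtual ~LanguageModelImplementation() = default;
    // Log-probability of `word` given the preceding words (assumed signature).
    virtual float GetLogProb(const std::vector<std::string>& context,
                             const std::string& word) const = 0;
};

// Toy backend for illustration: every word gets the same log-probability.
class UniformLM : public LanguageModelImplementation {
public:
    explicit UniformLM(float logProb) : logProb_(logProb) {}
    float GetLogProb(const std::vector<std::string>&,
                     const std::string&) const override {
        return logProb_;
    }
private:
    float logProb_;
};

// Decoder-side feature function. It contains no toolkit code; it only holds
// a shared handle to an implementation, so several LanguageModel objects
// can be backed by the same loaded model.
class LanguageModel : public StatefulFeatureFunction {
public:
    LanguageModel(std::shared_ptr<const LanguageModelImplementation> impl,
                  float weight)
        : impl_(std::move(impl)), weight_(weight) {
        assert(impl_ != nullptr);
    }

    // Sums the word log-probabilities left to right and applies the weight.
    float Evaluate(const std::vector<std::string>& words) const override {
        float total = 0.0f;
        std::vector<std::string> context;
        for (const std::string& w : words) {
            total += impl_->GetLogProb(context, w);
            context.push_back(w);
        }
        return weight_ * total;
    }

private:
    std::shared_ptr<const LanguageModelImplementation> impl_;
    float weight_;
};
```

In this sketch, two `LanguageModel` objects constructed from the same `std::shared_ptr` query a single implementation instance, which mirrors the sharing described in the message.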


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3719 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-17 14:06:21 +00:00
heafield
2784923899 Rename a bunch of kenlm files. A ./regenerate-makefiles.sh is required.
Make loading with MAP_POPULATE on Linux and read on other OSes the default.
Use LM #9 for lazy loading, as recommended by other devs.  
Slightly faster trie.  



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3688 1f5c12ca-751b-0410-a591-d2e778427230
2010-11-06 00:40:16 +00:00
heafield
c12c2c59d2 Autodetect model from binary format.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3675 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-28 01:05:04 +00:00
bhaddow
7e72ceea22 Goodbye ScoreIndexManager.
Compiles ok, but haven't dared to run regression yet.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/branches/mira-mtm5@3608 1f5c12ca-751b-0410-a591-d2e778427230
2010-10-06 22:06:49 +00:00
hieuhoang1972
8fa18b50a7 xcode. And don't invoke internal LM when sri is specified, even if sri isn't compiled
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3583 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-28 11:02:58 +00:00
heafield
5b74b38527 Remove vestigial dub parameter. Surrender to tab-based whitespace. More passive-aggressive message about nGramOrder.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3575 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-27 16:01:58 +00:00
hieuhoang1972
32d3565b04 ken lm integration
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3543 1f5c12ca-751b-0410-a591-d2e778427230
2010-09-21 22:43:29 +00:00
bhaddow
904133fcb7 Merge in the multiple models branch. These changes allow the moses server
to support multiple translation, language and generation models within the
same process. The main design change is the introduction of a TranslationSystem
object to manage the models, which have been moved out of StaticData.
The changes should have no effect on existing systems.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3394 1f5c12ca-751b-0410-a591-d2e778427230
2010-08-10 13:12:00 +00:00
hieuhoang1972
1a638746f3 SRI compile error in Ondrej's parallel backoff LM
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3201 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-26 12:02:28 +00:00
bojar
34a0e9b3a2 support for SRILM's factored language models, implemented by Michal Richter
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@3147 1f5c12ca-751b-0410-a591-d2e778427230
2010-04-20 15:25:52 +00:00
hieuhoang1972
c2da0faf05 vs build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@2180 1f5c12ca-751b-0410-a591-d2e778427230
2009-02-18 11:35:41 +00:00
redpony
3f7f12f4ad add client for remote language model
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1993 1f5c12ca-751b-0410-a591-d2e778427230
2009-01-22 21:31:17 +00:00
hieuhoang1972
789d6d96d1 integrate randlm
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1935 1f5c12ca-751b-0410-a591-d2e778427230
2008-11-04 18:03:03 +00:00
hieuhoang1972
928d771085 create namespace
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1897 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-08 23:51:26 +00:00
hieuhoang1972
4f642808f1 move cube pruning moses lib to trunk
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1848 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-11 10:52:57 +00:00
hieuhoang1972
6615fe0302 delete old moses lib
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1847 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-11 10:52:01 +00:00
nicolabertoldi
a48be8f280 Handling dictionary upperbound for IRST LMs through parameter -lmodel-dub
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1592 1f5c12ca-751b-0410-a591-d2e778427230
2008-04-04 15:52:45 +00:00
hieuhoang1972
6b611279d5 minor gcc compile error.
also, no longer use IRSTLM as a substitute for SRILM, and vice versa. They don't give identical results - avoids confusion.

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1229 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-21 20:05:27 +00:00
hieuhoang1972
7ecb0ce66e change unknown word processing to be closer to the way pharaoh does it - create unknown word whenever single word is not in translation table but penalise hypothesis for using it.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1227 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-21 19:48:53 +00:00
hieuhoang1972
f3cbacba3e code cleanup - make FactorCollection and StaticData accessible only globally
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1218 1f5c12ca-751b-0410-a591-d2e778427230
2007-02-16 18:08:37 +00:00
hieuhoang1972
a1f39c3ce7 lots of small changes and code clean up:
error catching/fail more gracefully on tables/lm load error & consistent user output
consistent debugging output
cleaned up timing functions
cleaned up moses/moses-cmd api calls/interaction
split up loading of all data in StaticData into separate fns
got binary phrase table to work under WIN32 & passed regression !!
added more comments
deleted phrase table filtering code
deleted mysql support
change calls to ToString() which might affect decoding to a call to a non-debugging fn instead, eg GetString()


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@988 1f5c12ca-751b-0410-a591-d2e778427230
2006-11-21 19:35:37 +00:00
hieuhoang1972
095f393250 return state arg from LM Internal
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@961 1f5c12ca-751b-0410-a591-d2e778427230
2006-11-08 15:19:03 +00:00
hieuhoang1972
bf7987e1ae added internal, x-platform LM which exactly mimics SRILM
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@955 1f5c12ca-751b-0410-a591-d2e778427230
2006-11-06 15:19:38 +00:00
hieuhoang1972
2789059b80 added internal, x-platform LM which exactly mimics SRILM
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@953 1f5c12ca-751b-0410-a591-d2e778427230
2006-11-06 15:01:30 +00:00
hieuhoang1972
786408a55c renamed chunking LM to skip LM
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@916 1f5c12ca-751b-0410-a591-d2e778427230
2006-10-24 16:27:13 +00:00
hieuhoang1972
d6f9458d59 code cleanup and commenting brought about when documenting for jhu report
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@886 1f5c12ca-751b-0410-a591-d2e778427230
2006-10-17 11:07:17 +00:00
hieuhoang1972
841e589ffd rename files for consistency
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@811 1f5c12ca-751b-0410-a591-d2e778427230
2006-09-09 17:22:56 +00:00
hieuhoang1972
467fc4c97c bug in creating factored LM
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@750 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-15 05:41:21 +00:00
hieuhoang1972
91521bd911 LM code cleanup
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@734 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-14 21:56:28 +00:00
hieuhoang1972
93af8a2780 joint multi-factor LM
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@733 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-14 21:30:19 +00:00
hieuhoang1972
e88ba116b9 clean up creating language models
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@621 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 18:04:29 +00:00
hieuhoang1972
8fe3826a52 clean up creating language models
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@620 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 17:39:32 +00:00
hieuhoang1972
a0484dd82f allows composite score producers. Chunking LM consists of 1 single factor LM
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@605 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-09 22:17:10 +00:00
hieuhoang1972
b867c0e866 factored lm
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@591 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-09 03:18:37 +00:00
hieuhoang1972
f304d8ae29 factored lm
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@526 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-07 03:18:57 +00:00
hieuhoang1972
283c44a95a factored lm
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@525 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-07 02:48:16 +00:00
hieuhoang1972
7b023dfdb8 chunking lm
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@516 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-05 01:01:46 +00:00
hieuhoang1972
4dc00ba5c9 chunking lm
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@512 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-05 00:36:37 +00:00
hieuhoang1972
f6d65fb38c chunking lm
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@511 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-05 00:28:38 +00:00
hieuhoang1972
548ab61d9c chunking lm
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@509 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-04 23:24:01 +00:00
hieuhoang1972
37b1123b19 fallback to another LM if what we have isn't compiled into lib
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@484 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-03 16:18:13 +00:00
hieuhoang1972
425e1673ff allow different LM implementations to be used
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@475 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-02 23:22:42 +00:00
hieuhoang1972
3bb1c85674 cvs headers & visual studio cleanup
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@400 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 02:46:59 +00:00
redpony
2c60d6b430 pull LanguageModel creation logic into an LMFactory
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@217 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-20 15:30:27 +00:00