Commit Graph

2452 Commits

Author SHA1 Message Date
heafield
6b153c67f8 (16:51:52) Heafield: Does anybody use LanguageModelSkip?
(16:52:12) Hieu Hoang: not since jhu 2006
(16:52:17) Heafield: svn rm?  
(16:52:34) Hieu Hoang: aye. & see if anyone complains
(16:52:49) Hieu Hoang: & internal if u want to
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4352 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-13 16:01:00 +00:00
heafield
6bded791e6 Remove some virtual tags
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4351 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-13 15:34:37 +00:00
heafield
07e611ebcb Organize language models into an LM directory.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4350 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-13 14:27:01 +00:00
heafield
a95e791056 Back to using StringPiece
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4349 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-13 13:32:14 +00:00
heafield
f084248405 Cut the middle men out of the language model interface.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4348 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-13 12:33:05 +00:00
heafield
7d9bc523a6 Remove unused code
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4347 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-13 09:44:51 +00:00
heafield
541f776198 Remove unused calls
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4346 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-12 20:04:02 +00:00
heafield
e5d15a537e KenLM-specific Evaluate function
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4345 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-12 19:49:27 +00:00
heafield
cd19f14826 Faster CalcScore implementation for KenLM
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4339 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-12 13:04:12 +00:00
heafield
81acd0ffa2 Dear Hieu, a StringPiece is not necessarily null-terminated. When loading ARPA files directly, it was copying the ARPA file as
part of the vocabulary word and breaking everything.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4338 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-12 11:45:46 +00:00
heafield
c3f2ef7b25 Fix bhaddow's oovCount. Should be all words, not just the first in the phrase.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4337 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-12 10:22:45 +00:00
hieuhoang1972
b88fad16f8 create valid html header, according to Tomas Hudik
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4336 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-12 10:18:36 +00:00
heafield
15adb17e35 Move EnumerateVocab to namespace lm
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4335 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-12 10:18:23 +00:00
hieuhoang1972
a65efa5a60 relax overly harsh assert
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4334 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-12 10:12:49 +00:00
heafield
19f3f09a39 Updated left state minimization makes all states of length N-1 full
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4332 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-11 18:40:00 +00:00
heafield
86f1d3ec71 Fix trie for ARPAs from SRILM.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4331 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-11 18:27:36 +00:00
heafield
ba41862d37 Source files are not executables.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4330 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-11 16:07:16 +00:00
heafield
16e37adbe0 Move phrase scoring from LanguageModel to LanguageModelImplementation.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4324 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-11 13:50:44 +00:00
heafield
c9995dc44c Trie building bug fix
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4323 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-11 10:12:17 +00:00
hieuhoang1972
b0e5d6c005 delete align info flag in target phrase. Not used
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4322 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-11 08:55:19 +00:00
hieuhoang1972
ea4db80473 extract lex probability from gzip files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4321 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-11 06:49:19 +00:00
heafield
8f0c841d28 Move ChartHypothesis stuff to LanguageModelImplementation. Ran the
regression tests... the passes and fails are the same FWIW.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4319 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-10 16:25:56 +00:00
heafield
5a0d84da9a Move LanguageModelChartState into LanguageModelImplementation in preparation for moving responsibility for boundary word tracking
from ChartHypothesis to LanguageModelChartState.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4316 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-10 11:15:13 +00:00
hieuhoang1972
235dda25e7 extractor wrapper to make it work on SGE
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4315 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-10 04:28:55 +00:00
heafield
71d0d389c5 Fix silly bug in merging
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4314 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-08 10:59:54 +00:00
hieuhoang1972
96c8ff4b15 last checkin was for the oldest bug found in moses! Goes back to svn version 4, and prob earlier, but svn can't diff that far.
Should have crapped out whenever there is a blank line in the ini file, which is basically every ini file. Only Visual Studio 2010 complained, and only recently. Very strange, and a bit worrying. Hooray anyway

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4304 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-06 15:06:59 +00:00
servan
eef0f213e9 A mert/MergeScorer.h
A    mert/MergeScorer.cpp
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4303 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-06 11:00:48 +00:00
heafield
9ba5460e53 Apparently we wanted a sequential id after all... get one in a thread-safe way from the manager.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4302 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-06 10:31:09 +00:00
hieuhoang1972
6f22c2ae29 bug reading over string size
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4301 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-06 09:16:21 +00:00
servan
b00286b773 M scripts/released-files
M    scripts/Makefile
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4300 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-05 13:37:35 +00:00
servan
f223f5a276 M mert/TerScorer.cpp
M    mert/BleuScorer.h
M    mert/ScorerFactory.h
M    mert/Scorer.h
M    mert/PerScorer.h
M    mert/TerScorer.h
M    mert/Makefile.am
AM   scripts/training/mert-moses-multi.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4299 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-05 13:36:17 +00:00
phkoehn
568a8cc0f4 fixed broken sparse feature training
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4298 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-05 10:29:01 +00:00
bhaddow
210f87bebd Support for lattice sampling (use -lattice-samples n)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4296 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-04 20:45:47 +00:00
bhaddow
84d73700af Implementation of lattice sampling (Chatterjee and Cancedda, EMNLP 2010)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4295 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-04 15:46:24 +00:00
nicolabertoldi
23d9a9b55e normalization of output spaces before and after field separator
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4293 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-03 16:11:39 +00:00
nicolabertoldi
47e452a076 made LM interface compliant with IRSTLM 5.70.02; fixed a bug related to word encoding
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4292 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-03 16:02:02 +00:00
hieuhoang1972
1ea3acde3d compile error due to last commit
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4291 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-03 08:31:05 +00:00
hieuhoang1972
3dafb3589c visual studio
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4290 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-03 07:18:14 +00:00
hieuhoang1972
7fa74c1eb2 roll back kenlm tests. Binarized files are OS-dependent
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4287 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-01 12:22:24 +00:00
hieuhoang1972
e1c808ad9a roll back kenlm tests. Binarized files are OS-dependent
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4286 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-01 11:33:19 +00:00
hieuhoang1972
6faf20707c roll back kenlm tests. Binarized files are OS-dependent
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4285 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-01 11:09:19 +00:00
hieuhoang1972
f51239cf68 kenlm regression tests
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4284 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-30 14:12:18 +00:00
bhaddow
521d7b2199 argument for sort buffer size
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4282 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-29 15:11:31 +00:00
nicolabertoldi
2838970fc0 changed the interface towards IRSTLM according to the recent changes in FactorCollection
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4280 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-28 18:21:55 +00:00
hieuhoang1972
7538f5406a xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4278 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-28 00:30:18 +00:00
pjwilliams
fc0ab40bb0 Add tool for converting rule table to compact format.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4274 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-27 01:20:18 +00:00
pjwilliams
ea272dc198 Move SCFG rule table loading code out of PhraseDictionarySCFG and into a
separate RuleTableLoader class.  Start adding support for a faster-loading
rule table format.

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4273 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-27 00:34:46 +00:00
heafield
41cc547360 Fix a segfault with the chop variant of trie in chart decoding
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4272 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-26 20:54:41 +00:00
jhclark
d6b15ff038 Hieu reports previous zlib check is borken for him. Let's see if this flies.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4271 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-26 19:34:54 +00:00
jhclark
7538a2db0c Check for zlib, since we use it. Be more helpful if it isn't there.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4270 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-26 15:42:18 +00:00