130 Commits

Author SHA1 Message Date
Phil Williams
4a68e5f9e7 extract-ghkm: add support for XML parse tree files that
use the "span" attribute (like those produced by the
relax-parse tool).
2012-04-23 14:24:54 +01:00
phikoehn
2c520fb93c multi-threaded hierarchical rule extractor 2012-04-17 05:54:48 +01:00
Hieu Hoang
91b2804fbf xcode 2012-03-15 13:26:54 +07:00
Hieu Hoang
b8c1c53e2b visual studio 2012-02-08 17:57:36 +07:00
Hieu Hoang
8efd2cb558 visual studio 2012-02-08 13:26:39 +07:00
Hieu Hoang
be1a26c482 input file can be gzipped 2012-02-08 12:51:55 +07:00
Michal Hrusecky
8ab6c7a655 Alway return something in non-void functions
There were functions defined as non-void but didn't have return value
for all possible passes. This can result in undefined behavior. Fixed
this issue and returning values that somehow makes sense hopefully.
2012-02-01 14:03:49 +01:00
bhaddow
ddd3d97bcb Restore Hieu's phrase extraction speedups 2012-01-12 14:34:52 +00:00
pkoehn
3060da747e bug fix with Good Turing / Kneser Ney discounting 2012-01-09 23:43:08 +00:00
Hieu Hoang
6fda48eaef error message 2012-01-04 23:29:31 +07:00
Kenneth Heafield
9ab49bced2 Assume people want all the executables and some examples installed. Also we're not in cvs anymore. 2011-11-30 15:22:21 -05:00
Kenneth Heafield
501b9a05bf Jam to install scripts (doesn't modify them yet). 2011-11-25 14:23:10 +00:00
Kenneth Heafield
613461e5d6 More Jamfiles for training 2011-11-25 10:34:34 +00:00
Kenneth Heafield
fac8f7d751 Jam phrase extraction 2011-11-25 10:26:19 +00:00
Phil Williams
ee0a6dbd5c extract-ghkm: add the features required for use as a drop-in replacement
for extract-rules: composed rules, unaligned source word attachment,
non-lexical unary rule elimination, glue rule generation, unknown word
label generation, and EMS integration.
2011-11-21 16:21:04 +00:00
Hieu Hoang
87500bc93d visual studio 2011-11-11 22:24:25 +00:00
hieuhoang1972
897fe0f88b visual studio
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4356 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-14 10:50:08 +00:00
hieuhoang1972
57bf51fd05 all programs in training can take in gzipped file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4354 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-13 18:57:23 +00:00
hieuhoang1972
ea4db80473 extract lex probability from gzip files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4321 1f5c12ca-751b-0410-a591-d2e778427230
2011-10-11 06:49:19 +00:00
hieuhoang1972
d2245390e0 visual studio build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4250 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-22 05:39:32 +00:00
hieuhoang1972
08805d6a9c comment revert
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4246 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-21 11:04:48 +00:00
hieuhoang1972
659e34735d xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4245 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-21 11:01:19 +00:00
hieuhoang1972
4313e335b5 print out span widths of non-terms. Extra argument --OutputNTLengths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4230 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-16 17:13:34 +00:00
bhaddow
4d5b17f444 Option to create extract file with sentence ids
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4229 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-16 15:37:02 +00:00
hieuhoang1972
1e1eb4d29e print out span widths of non-terms. Extra argument --OutputNTLengths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4225 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-14 18:16:05 +00:00
hieuhoang1972
149208ecba print out span widths of non-terms. Extra argument --OutputNTLengths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4224 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-14 10:23:14 +00:00
hieuhoang1972
d68274d217 print out span widths of non-terms. Extra argument --OutputNTLengths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4223 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-14 07:15:36 +00:00
hieuhoang1972
b1ca5e1fc8 xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4222 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-13 19:01:04 +00:00
hieuhoang1972
b8606b3e70 xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4221 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-13 18:51:28 +00:00
bhaddow
2c585ce6e7 restore
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4186 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-07 16:42:46 +00:00
bhaddow
de51b69d03 remove (temporarily)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4185 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-07 16:40:55 +00:00
phkoehn
41a1849437 support for sparse feature functions (mert support only when using PRO)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4184 1f5c12ca-751b-0410-a591-d2e778427230
2011-09-07 16:37:33 +00:00
hieuhoang1972
30ca534b86 faster scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4119 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:27:15 +00:00
hieuhoang1972
cdbb850cc3 fix new scorer to output phrase pairs in same order as old scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4110 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 07:36:25 +00:00
hieuhoang1972
65f7ffb783 delete debug message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4096 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:58:05 +00:00
hieuhoang1972
59771e8dbe delete debug message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4095 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:27:06 +00:00
hieuhoang1972
677378774a optmised version of score program. Original version is slow when source phrase has many target phrases 'cos it scans a large vector. New version puts it into a set. Slight hack in that it const_cast to get items out of the set. For a source with 100k targets, took 1.2sec, versus 2m20sec. Current version can can take days to run. Won't make it the main score program until regression test for score is set up
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4093 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 09:40:58 +00:00
hieuhoang1972
3b1dac4178 start on speed optimisation for scoring
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4092 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 07:55:03 +00:00
hieuhoang1972
876ad74dbd create reverse phrase table. Not ready for prime time
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4091 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-26 09:12:58 +00:00
hieuhoang1972
a79651d239 fixed backoff phrase table. Allow backoff of unigrams
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4089 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-25 12:23:49 +00:00
hieuhoang1972
068c17f368 vs.net build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4087 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 23:26:56 +00:00
hieuhoang1972
9c0d725cde visual studio 2010
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4079 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-19 03:07:15 +00:00
hieuhoang1972
1190b75528 consolidate gzip files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4077 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-18 05:08:26 +00:00
hieuhoang1972
c1991f8a27 rewrite of lex prob calculation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4069 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-07 09:29:03 +00:00
hieuhoang1972
126739f3f1 debug info
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4063 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 10:33:04 +00:00
hieuhoang1972
4d66952b9b gcc
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4062 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 10:27:47 +00:00
hieuhoang1972
efc9c77de6 lex prob, almost working
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4061 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 10:22:55 +00:00
hieuhoang1972
d72b7cde92 makefile
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4060 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 09:06:17 +00:00
hieuhoang1972
dd9a9b6e43 vs.net
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4059 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 05:48:44 +00:00
hieuhoang1972
8595b06dce rewrite lex prob calc
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4058 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 05:40:46 +00:00