Commit Graph

2310 Commits

Author SHA1 Message Date
theleopardess
f8a99e5d6d yanggao-softdep-v0
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4122 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 15:11:43 +00:00
bgottesman
eda0f4e370 An initial test suite for detokenizer.perl.
I realize this doesn't quite fit the paradigm of the existing Moses test suite.  On the other hand, it's self-contained, easy to run, easy to add tests to (just follow the pattern in the section titled 'Definitions of individual test cases'), and uses an established Perl testing framework.  I don't think it will be infeasible to incorporate it into the existing test suite.

Usage:

run-test-detokenizer.t --results-dir <RESULTS-DIRECTORY>

where <RESULTS-DIRECTORY> is an empty existing directory where the output can be written


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4121 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 14:32:39 +00:00
hieuhoang1972
30ca534b86 faster scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4119 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:27:15 +00:00
hieuhoang1972
b4c79f721e regression test for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4118 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:18:07 +00:00
hieuhoang1972
b618aadf8d regression test for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4117 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 09:23:48 +00:00
hieuhoang1972
b8a0b09206 regression test for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4116 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 02:48:30 +00:00
bojar
779873a2a2 merged Philipp's updates up to r4106 inclusive
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4115 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 23:05:18 +00:00
bojar
7a301a7b5a negligible polishing
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4114 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 22:24:35 +00:00
hieuhoang1972
fc176801d6 regression test for score
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4112 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 09:15:43 +00:00
hieuhoang1972
e988361d62 regression test for score
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4111 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 08:11:40 +00:00
hieuhoang1972
cdbb850cc3 fix new scorer to output phrase pairs in same order as old scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4110 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 07:36:25 +00:00
hieuhoang1972
e7b97c1b1a vs build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4109 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 04:53:21 +00:00
heafield
61974ad75e Minor fixes. One for David Chiang who has files without initial newlines.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4108 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 19:46:19 +00:00
phkoehn
36db0ffe48 added pairwise ranked optimization (PRO) as proposed by [Hopkins&May,2011], just use switch --pairwise-ranked
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4106 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 17:00:17 +00:00
nicolabertoldi
579d8b0760 added a few regression tests explicitly working with IRSTLM; modified a few regression tests wrongly working with IRSTLM/SRILM; modified the required data archive (now version 6)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4105 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 09:19:48 +00:00
hieuhoang1972
49e56f35bb regression test for score
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4102 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:28:36 +00:00
hieuhoang1972
d45a29d9c7 data for score regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4101 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:10:09 +00:00
hieuhoang1972
ed4367ceb0 data for score regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4100 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:04:53 +00:00
hieuhoang1972
69fe991923 data for score regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4099 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:53:25 +00:00
hieuhoang1972
1ae8c53a08 executable perl script
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4098 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:31:01 +00:00
hieuhoang1972
acb7e984de starting regression test for score program
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4097 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:27:59 +00:00
hieuhoang1972
65f7ffb783 delete debug message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4096 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:58:05 +00:00
hieuhoang1972
59771e8dbe delete debug message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4095 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:27:06 +00:00
hieuhoang1972
e389e9fec7 default decoders if none specified
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4094 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:14:33 +00:00
hieuhoang1972
677378774a optimised version of score program. Original version is slow when source phrase has many target phrases 'cos it scans a large vector. New version puts it into a set. Slight hack in that it uses const_cast to get items out of the set. For a source with 100k targets, took 1.2sec, versus 2m20sec. Current version can take days to run. Won't make it the main score program until regression test for score is set up
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4093 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 09:40:58 +00:00
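
The speed-up reported in 677378774a (1.2sec versus 2m20sec for a source phrase with 100k targets) comes from replacing a linear scan over a vector with a set lookup. The following is a minimal Python sketch of that general idea, not the actual Moses scorer, which is written in C++. The function names and the toy data are hypothetical, and Python's dict is hash-based, whereas the commit does not say which set type the scorer uses.

```python
def count_targets_scan(targets):
    """Count target phrases using a linear scan (the slow pattern).

    Each new phrase is compared against every phrase seen so far, so the
    total work grows quadratically with the number of distinct targets.
    """
    seen = []    # stands in for the "large vector" mentioned in the commit
    counts = []
    for phrase in targets:
        for i, known in enumerate(seen):
            if known == phrase:
                counts[i] += 1
                break
        else:
            # No match found in the scan: record a new phrase.
            seen.append(phrase)
            counts.append(1)
    return dict(zip(seen, counts))


def count_targets_keyed(targets):
    """Count target phrases using a keyed container (the fast pattern).

    Lookups no longer walk the whole collection, so each phrase is
    handled in roughly constant time.
    """
    counts = {}
    for phrase in targets:
        counts[phrase] = counts.get(phrase, 0) + 1
    return counts


if __name__ == "__main__":
    toy_targets = ["das haus", "ein haus", "das haus"]  # hypothetical data
    assert count_targets_scan(toy_targets) == count_targets_keyed(toy_targets)
```

Both functions return the same counts; only the lookup cost differs, which is why the change matters most for source phrases with very many target phrases.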
hieuhoang1972
3b1dac4178 start on speed optimisation for scoring
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4092 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 07:55:03 +00:00
hieuhoang1972
876ad74dbd create reverse phrase table. Not ready for prime time
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4091 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-26 09:12:58 +00:00
hieuhoang1972
a79651d239 fixed backoff phrase table. Allow backoff of unigrams
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4089 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-25 12:23:49 +00:00
hieuhoang1972
b0ec298ce2 vs.net build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4088 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 23:52:34 +00:00
hieuhoang1972
068c17f368 vs.net build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4087 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 23:26:56 +00:00
phkoehn
1bd74fc87f added random directions [Cer&al.,2008] and historic best as starting points [Foster&Kuhn,2009] to MERT
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4086 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 00:24:45 +00:00
hieuhoang1972
6a27dc4f17 example of how to run
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4084 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-22 08:32:09 +00:00
chesio
1b9d99a5ad BilingualDynSuffixArray corpus may now be loaded from gzipped file as well (removed needless call to seekg()).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4083 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-21 23:29:11 +00:00
chesio
4918003635 absolutize_moses_model and clone_moses_model are now aware of the suffix array format.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4082 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-21 23:15:08 +00:00
hieuhoang1972
06af5d40d4 Improved error message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4081 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-21 02:41:23 +00:00
pjwilliams
113d0f24dd moses_chart: avoid doing some std::map retrievals during rule lookup
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4080 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-19 12:57:02 +00:00
hieuhoang1972
9c0d725cde visual studio 2010
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4079 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-19 03:07:15 +00:00
pjwilliams
beba4b475f moses_chart: merge DottedRule and CoveredChartSpan classes. This saves
some memory for models that require a lot of lookup state (generally
grammars with lots of target categories).

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4078 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-18 21:44:27 +00:00
hieuhoang1972
1190b75528 consolidate gzip files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4077 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-18 05:08:26 +00:00
hieuhoang1972
fd08431e3b xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4076 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-15 12:09:33 +00:00
hieuhoang1972
e174b5dea2 fix by Nicola Bertoldi for lexical probability calculation. Previous implementation was sensitive to double spaces and spaces at the beginning of the sentence, counting a space after another space as a word.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4075 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-14 11:26:35 +00:00
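
The class of bug described in e174b5dea2 (a space following another space, or a leading space, being counted as a word) can be illustrated with a short Python sketch. This is not the Moses code or the actual fix; the function names are hypothetical and the example only shows how splitting on a single space character differs from splitting on runs of whitespace.

```python
def count_words_single_space(sentence):
    """Split on exactly one space character.

    Leading or doubled spaces produce empty strings, and each empty
    string is wrongly counted as a word.
    """
    return len(sentence.split(" "))


def count_words_whitespace_runs(sentence):
    """Split on runs of whitespace.

    str.split() with no argument collapses consecutive whitespace and
    ignores leading and trailing whitespace, so no empty tokens appear.
    """
    return len(sentence.split())


if __name__ == "__main__":
    line = " the  house"  # leading space and a double space
    print(count_words_single_space(line))     # counts empty tokens too
    print(count_words_whitespace_runs(line))  # counts only real words
```

Any word-position bookkeeping built on the first function, such as alignment indices, shifts whenever the input contains stray spaces, which is the kind of sensitivity the commit message reports.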
heafield
954dfd7d5e Optional compression for trie. Also, some better error handling.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4074 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-13 20:53:18 +00:00
bhaddow
846748fa3f A more helpful error message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4072 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-08 20:07:07 +00:00
bojar
66b71a7f5c Ondrej's little tools to examine weight settings
not quite fit for public use, esp. the -summarize.sh one...


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4071 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-08 00:11:10 +00:00
bhaddow
8ffbe2389e Fix bug in handling of the score options
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4070 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-07 16:31:16 +00:00
hieuhoang1972
c1991f8a27 rewrite of lex prob calculation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4069 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-07 09:29:03 +00:00
leven101
cd96c02748 bug fixes
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4068 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-06 17:25:54 +00:00
bojar
a57e71a13f cherrypick r3919: zero jobs means serial
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4067 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-03 21:41:19 +00:00
bojar
1ba2de3c02 - cmert: added support for passing min and max values for weights
(used to be in old cmert but not in new cmert, i.e. moses/mert/)
- modified mert-moses.pl accordingly, esp. set min&max to 0&1 as it used to be
  hardwired in the new cmert
- adding mert-moses-ondrej.pl, a simplification of mert-moses.pl, please test it


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4066 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-03 21:01:16 +00:00
leven101
52ce926901 added ClearWordInCache() to clear (nonfrequent) lexical word pair probs after suffix array updates
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4065 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-01 15:40:48 +00:00