2316 Commits

Author SHA1 Message Date
bgottesman
9d9977bc6f add TODO tests for detokenization of Chinese and Japanese
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4131 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 13:21:05 +00:00
bgottesman
c030dae094 Allow a test case to have an undefined language, since the detokenizer doesn't require a language to be passed in and, indeed, errors if a language is passed in for which there are no special rules (which seems dubious to me ...). Add test case TEST_GERMAN_NONASCII with an undefined language.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4130 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 19:14:01 +00:00
theleopardess
d7752b44fc I tested check-in by adding a test line in moses/src/StaticData.cpp, producing a trivial moses revision 4122. Now I have removed that line. Everything ok but sorry for the confusion.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4129 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 18:57:09 +00:00
bgottesman
024bbe0bcc - factor out class DetokenizerTestCase
- create an array of all of the test cases before running any of them
- in the case of an expected failure, move the TODO block deeper, just around the validation of the results

I'm not 100% sure I like this change; I think it makes the code slightly more elegant, but it also makes it longer.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4128 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 18:48:37 +00:00
bgottesman
d521287a3f move commas to after here-docs, to hopefully make test cases more readable; and remove unused import
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4125 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 16:37:49 +00:00
bgottesman
76c3ef4dba a few more detokenization tests, including a TODO one that exposes a bug
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4124 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 16:23:47 +00:00
theleopardess
f8a99e5d6d yanggao-softdep-v0
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4122 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 15:11:43 +00:00
bgottesman
eda0f4e370 An initial test suite for detokenizer.perl.
I realize this doesn't quite fit the paradigm of the existing moses test suite.  On the other hand, it's self-contained, easy to run, easy to add tests to (just follow the pattern in the section titled 'Definitions of individual test cases'), and uses an established Perl testing framework.  I don't think it will be infeasible to incorporate it into the existing test suite.

Usage:

run-test-detokenizer.t --results-dir <RESULTS-DIRECTORY>

where <RESULTS-DIRECTORY> is an empty existing directory where the output can be written


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4121 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 14:32:39 +00:00
hieuhoang1972
30ca534b86 faster scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4119 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:27:15 +00:00
hieuhoang1972
b4c79f721e regression test for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4118 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:18:07 +00:00
hieuhoang1972
b618aadf8d regression test for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4117 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 09:23:48 +00:00
hieuhoang1972
b8a0b09206 regression test for scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4116 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 02:48:30 +00:00
bojar
779873a2a2 merged Philipp's updates up to r4106 inclusive
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4115 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 23:05:18 +00:00
bojar
7a301a7b5a negligible polishing
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4114 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 22:24:35 +00:00
hieuhoang1972
fc176801d6 regression test for score
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4112 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 09:15:43 +00:00
hieuhoang1972
e988361d62 regression test for score
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4111 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 08:11:40 +00:00
hieuhoang1972
cdbb850cc3 fix new scorer to output phrase pairs in same order as old scorer
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4110 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 07:36:25 +00:00
hieuhoang1972
e7b97c1b1a vs build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4109 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 04:53:21 +00:00
heafield
61974ad75e Minor fixes. One for David Chiang who has files without initial newlines.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4108 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 19:46:19 +00:00
phkoehn
36db0ffe48 added pairwise ranked optimization (PRO) as proposed by [Hopkins&May,2011], just use switch --pairwise-ranked
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4106 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 17:00:17 +00:00
nicolabertoldi
579d8b0760 added a few regression tests explicitly working with IRSTLM; modified a few regression tests wrongly working with IRSTLM/SRILM; modified the required data archive (now version 6)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4105 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 09:19:48 +00:00
hieuhoang1972
49e56f35bb regression test for score
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4102 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:28:36 +00:00
hieuhoang1972
d45a29d9c7 data for score regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4101 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:10:09 +00:00
hieuhoang1972
ed4367ceb0 data for score regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4100 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:04:53 +00:00
hieuhoang1972
69fe991923 data for score regression test
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4099 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:53:25 +00:00
hieuhoang1972
1ae8c53a08 executable perl script
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4098 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:31:01 +00:00
hieuhoang1972
acb7e984de starting regression test for score program
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4097 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:27:59 +00:00
hieuhoang1972
65f7ffb783 delete debug message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4096 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:58:05 +00:00
hieuhoang1972
59771e8dbe delete debug message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4095 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:27:06 +00:00
hieuhoang1972
e389e9fec7 default decoders if none specified
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4094 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:14:33 +00:00
hieuhoang1972
677378774a optimised version of score program. Original version is slow when a source phrase has many target phrases 'cos it scans a large vector. New version puts it into a set. Slight hack in that it uses const_cast to get items out of the set. For a source with 100k targets, took 1.2sec, versus 2m20sec. Current version can take days to run. Won't make it the main score program until regression test for score is set up
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4093 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 09:40:58 +00:00
hieuhoang1972
3b1dac4178 start on speed optimisation for scoring
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4092 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 07:55:03 +00:00
hieuhoang1972
876ad74dbd create reverse phrase table. Not ready for prime time
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4091 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-26 09:12:58 +00:00
hieuhoang1972
a79651d239 fixed backoff phrase table. Allow backoff of unigrams
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4089 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-25 12:23:49 +00:00
hieuhoang1972
b0ec298ce2 vs.net build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4088 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 23:52:34 +00:00
hieuhoang1972
068c17f368 vs.net build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4087 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 23:26:56 +00:00
phkoehn
1bd74fc87f added random directions [Cer&al.,2008] and historic best as starting points [Foster&Kuhn,2009] to MERT
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4086 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 00:24:45 +00:00
hieuhoang1972
6a27dc4f17 example of how to run
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4084 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-22 08:32:09 +00:00
chesio
1b9d99a5ad BilingualDynSuffixArray corpus may now be loaded from gzipped file as well (removed needless call to seekg()).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4083 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-21 23:29:11 +00:00
chesio
4918003635 absolutize_moses_model and clone_moses_model are now aware of the suffix array format.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4082 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-21 23:15:08 +00:00
hieuhoang1972
06af5d40d4 Improved error message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4081 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-21 02:41:23 +00:00
pjwilliams
113d0f24dd moses_chart: avoid doing some std::map retrievals during rule lookup
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4080 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-19 12:57:02 +00:00
hieuhoang1972
9c0d725cde visual studio 2010
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4079 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-19 03:07:15 +00:00
pjwilliams
beba4b475f moses_chart: merge DottedRule and CoveredChartSpan classes. This saves
some memory for models that require a lot of lookup state (generally
grammars with lots of target categories).

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4078 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-18 21:44:27 +00:00
hieuhoang1972
1190b75528 consolidate gzip files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4077 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-18 05:08:26 +00:00
hieuhoang1972
fd08431e3b xcode
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4076 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-15 12:09:33 +00:00
hieuhoang1972
e174b5dea2 fix by Nicola Bertoldi for lexical probability calculation. Previous implementation was sensitive to double spaces and spaces at the beginning of the sentence, counting a space after another space as a word.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4075 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-14 11:26:35 +00:00
heafield
954dfd7d5e Optional compression for trie. Also, some better error handling.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4074 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-13 20:53:18 +00:00
bhaddow
846748fa3f A more helpful error message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4072 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-08 20:07:07 +00:00
bojar
66b71a7f5c Ondrej's little tools to examine weight settings
not quite fit for public use, esp. the -summarize.sh one...


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4071 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-08 00:11:10 +00:00