bgottesman
9d9977bc6f
add TODO tests for detokenization of Chinese and Japanese
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4131 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-08 13:21:05 +00:00
bgottesman
c030dae094
Allow a test case to have an undefined language, since the detokenizer doesn't require a language to be passed in and, indeed, errors if a language is passed in for which there are no special rules (which seems dubious to me ...). Add test case TEST_GERMAN_NONASCII with an undefined language.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4130 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 19:14:01 +00:00
theleopardess
d7752b44fc
I tested check-in by adding a test line in moses/src/StaticData.cpp, producing a trivial moses revision 4122. Now I have removed that line. Everything ok but sorry for the confusion.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4129 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 18:57:09 +00:00
bgottesman
024bbe0bcc
- factor out class DetokenizerTestCase
- create an array of all of the test cases before running any of them
- in the case of an expected failure, move the TODO block deeper, just around the validation of the results
I'm not 100% sure I like this change. I think it makes the code slightly more elegant, but it also makes it longer.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4128 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 18:48:37 +00:00
bgottesman
d521287a3f
move commas to after here-docs, to hopefully make test cases more readable; and remove unused import
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4125 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 16:37:49 +00:00
bgottesman
76c3ef4dba
a few more detokenization tests, including a TODO one that exposes a bug
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4124 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 16:23:47 +00:00
theleopardess
f8a99e5d6d
yanggao-softdep-v0
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4122 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 15:11:43 +00:00
bgottesman
eda0f4e370
An initial test suite for detokenizer.perl.
I realize this doesn't quite fit the paradigm of the existing moses test suite. On the other hand, it's self-contained, easy to run, easy to add tests to (just follow the pattern in the section titled 'Definitions of individual test cases'), and uses an established Perl testing framework. I don't think it will be infeasible to incorporate it into the existing test suite.
Usage:
run-test-detokenizer.t --results-dir <RESULTS-DIRECTORY>
where <RESULTS-DIRECTORY> is an existing, empty directory where the output can be written
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4121 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 14:32:39 +00:00
hieuhoang1972
30ca534b86
faster scorer
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4119 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:27:15 +00:00
hieuhoang1972
b4c79f721e
regression test for scorer
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4118 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 10:18:07 +00:00
hieuhoang1972
b618aadf8d
regression test for scorer
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4117 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 09:23:48 +00:00
hieuhoang1972
b8a0b09206
regression test for scorer
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4116 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-05 02:48:30 +00:00
bojar
779873a2a2
merged Philipp's updates up to r4106 inclusive
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4115 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 23:05:18 +00:00
bojar
7a301a7b5a
negligible polishing
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4114 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 22:24:35 +00:00
hieuhoang1972
fc176801d6
regression test for score
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4112 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 09:15:43 +00:00
hieuhoang1972
e988361d62
regression test for score
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4111 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 08:11:40 +00:00
hieuhoang1972
cdbb850cc3
fix new scorer to output phrase pairs in same order as old scorer
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4110 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 07:36:25 +00:00
hieuhoang1972
e7b97c1b1a
vs build
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4109 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-04 04:53:21 +00:00
heafield
61974ad75e
Minor fixes. One for David Chiang who has files without initial newlines.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4108 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 19:46:19 +00:00
phkoehn
36db0ffe48
added pairwise ranked optimization (PRO) as proposed by [Hopkins&May,2011], just use switch --pairwise-ranked
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4106 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 17:00:17 +00:00
nicolabertoldi
579d8b0760
added a few regression tests that explicitly work with IRSTLM; modified a few regression tests that wrongly worked with IRSTLM/SRILM; modified the required data archive (now version 6)
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4105 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-03 09:19:48 +00:00
hieuhoang1972
49e56f35bb
regression test for score
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4102 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:28:36 +00:00
hieuhoang1972
d45a29d9c7
data for score regression test
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4101 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:10:09 +00:00
hieuhoang1972
ed4367ceb0
data for score regression test
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4100 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 10:04:53 +00:00
hieuhoang1972
69fe991923
data for score regression test
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4099 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:53:25 +00:00
hieuhoang1972
1ae8c53a08
executable perl script
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4098 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:31:01 +00:00
hieuhoang1972
acb7e984de
starting regression test for score program
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4097 1f5c12ca-751b-0410-a591-d2e778427230
2011-08-02 09:27:59 +00:00
hieuhoang1972
65f7ffb783
delete debug message
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4096 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:58:05 +00:00
hieuhoang1972
59771e8dbe
delete debug message
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4095 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:27:06 +00:00
hieuhoang1972
e389e9fec7
default decoders if none specified
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4094 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 10:14:33 +00:00
hieuhoang1972
677378774a
Optimised version of the score program. The original version is slow when a source phrase has many target phrases because it scans a large vector. The new version puts them into a set. Slight hack in that it uses const_cast to get items out of the set. For a source with 100k targets, it took 1.2 sec versus 2 min 20 sec. The current version can take days to run. Won't make it the main score program until the regression test for score is set up.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4093 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 09:40:58 +00:00
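The speed-up described in the commit above (replacing a scan over a large vector with a set lookup) can be illustrated with a minimal Python sketch. This is not the Moses score program, which is C++; the function names and the string representation of target phrases are hypothetical and chosen only to show why the membership test dominates the running time.

```python
# Hypothetical illustration; none of these names come from the Moses source.

def collect_targets_scan(targets):
    """Collect distinct target phrases using a list (linear scan per lookup)."""
    seen = []
    for t in targets:
        if t not in seen:      # O(n) scan, so O(n^2) overall for n targets
            seen.append(t)
    return seen


def collect_targets_set(targets):
    """Collect distinct target phrases using a set (constant-time lookup on average)."""
    seen = set()
    ordered = []               # keep first-seen order so both versions agree
    for t in targets:
        if t not in seen:      # hash lookup instead of a scan
            seen.add(t)
            ordered.append(t)
    return ordered
```

Both functions return the same list for the same input; only the cost of the membership test differs, which matters when a single source phrase has on the order of 100k target phrases.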
hieuhoang1972
3b1dac4178
start on speed optimisation for scoring
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4092 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-27 07:55:03 +00:00
hieuhoang1972
876ad74dbd
create reverse phrase table. Not ready for prime time
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4091 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-26 09:12:58 +00:00
hieuhoang1972
a79651d239
fixed backoff phrase table. Allow backoff of unigrams
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4089 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-25 12:23:49 +00:00
hieuhoang1972
b0ec298ce2
vs.net build
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4088 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 23:52:34 +00:00
hieuhoang1972
068c17f368
vs.net build
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4087 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 23:26:56 +00:00
phkoehn
1bd74fc87f
added random directions [Cer&al.,2008] and historic best as starting points [Foster&Kuhn,2009] to MERT
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4086 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-23 00:24:45 +00:00
hieuhoang1972
6a27dc4f17
example of how to run
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4084 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-22 08:32:09 +00:00
chesio
1b9d99a5ad
BilingualDynSuffixArray corpus may now be loaded from gzipped file as well (removed needless call to seekg()).
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4083 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-21 23:29:11 +00:00
chesio
4918003635
absolutize_moses_model and clone_moses_model are now aware of the suffix array format.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4082 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-21 23:15:08 +00:00
hieuhoang1972
06af5d40d4
Improved error message
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4081 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-21 02:41:23 +00:00
pjwilliams
113d0f24dd
moses_chart: avoid doing some std::map retrievals during rule lookup
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4080 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-19 12:57:02 +00:00
hieuhoang1972
9c0d725cde
visual studio 2010
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4079 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-19 03:07:15 +00:00
pjwilliams
beba4b475f
moses_chart: merge DottedRule and CoveredChartSpan classes. This saves
some memory for models that require a lot of lookup state (generally
grammars with lots of target categories).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4078 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-18 21:44:27 +00:00
hieuhoang1972
1190b75528
consolidate gzip files
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4077 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-18 05:08:26 +00:00
hieuhoang1972
fd08431e3b
xcode
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4076 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-15 12:09:33 +00:00
hieuhoang1972
e174b5dea2
fix by Nicola Bertoldi for lexical probability calculation. Previous implementation was sensitive to double spaces and spaces at the beginning of the sentence, counting a space after another space as a word.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4075 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-14 11:26:35 +00:00
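The class of bug fixed in the commit above (a double or leading space being counted as a word) can be reproduced in a few lines of Python. This is only a sketch of the general pitfall, not the actual Moses lexical probability code; the function names are hypothetical.

```python
# Hypothetical illustration; not taken from the Moses source.

def count_words_naive(sentence):
    """Split on a single space: empty strings from extra spaces are counted."""
    return len(sentence.split(" "))


def count_words_robust(sentence):
    """Split on runs of whitespace: leading and doubled spaces are ignored."""
    return len(sentence.split())
```

For the input `" the  house"` (one leading space, two spaces between the words), the naive version counts four tokens and the robust version counts two.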
heafield
954dfd7d5e
Optional compression for trie. Also, some better error handling.
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4074 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-13 20:53:18 +00:00
bhaddow
846748fa3f
A more helpful error message
...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4072 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-08 20:07:07 +00:00
bojar
66b71a7f5c
Ondrej's little tools to examine weight settings
not quite fit for public use, esp. the -summarize.sh one...
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4071 1f5c12ca-751b-0410-a591-d2e778427230
2011-07-08 00:11:10 +00:00