375 Commits

Author SHA1 Message Date
bojar
840441dc1a die if phrase mismatch discovered
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@688 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-13 05:35:02 +00:00
bojar
f246845489 utf8 output
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@686 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-13 02:44:28 +00:00
bojar
6fc349f75f gives nice overview of model complexity (in terms of ambiguity in translation and generation tables)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@670 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-11 22:33:48 +00:00
bojar
e6914693a1 reports also the top N words
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@668 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-11 21:48:39 +00:00
bojar
8f504a1d9b a handy script to count words that passed through the decoder unchanged (mostly because they're unknown); can exclude numbers and punctuation
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@667 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-11 21:26:24 +00:00
callison-burch
fce87ded03 Removed the .pyc files that were preventing the command 'make release' from executing properly.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@658 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-11 18:16:01 +00:00
bojar
e1936af681 marking finished_step also after last iteration finished
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@655 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-11 17:05:04 +00:00
bojar
75194c441d just a typo
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@647 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-11 15:50:23 +00:00
bojar
68ef1413cd allows arbitrary mixing of 'kept' and 'added' factors in output
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@627 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 22:00:02 +00:00
bojar
b65eafacc6 die if no refs found, report also number of refs and sents used
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@622 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 18:53:12 +00:00
bojar
15566bb58a utf8, support for printing source, too
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@618 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 14:35:09 +00:00
bojar
9b23b6d9c8 die in safesystem on child's death
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@612 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 01:42:49 +00:00
bojar
af1be61259 die when there are no phrases in input
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@611 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 01:37:29 +00:00
bojar
3deea84ccb adding cvsignore to ignore python-compiled files
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@609 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 00:27:34 +00:00
swadey
683435e058 - updated bleu and score-nbest to allow optional bypass of NIST-style normalization
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@608 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-10 00:24:19 +00:00
phkoehn
0595062d7d fixed error message on scripts root dir
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@607 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-09 23:52:50 +00:00
redpony
0ea85deef7 fix off-by-one error in tables-score (prevents null characters from being inserted)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@595 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-09 19:37:14 +00:00
eherbst
cf8c271469 minor, and moved stuff around
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@588 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-08 23:38:45 +00:00
bojar
e97b542717 added --debug mode to training script to keep all intermediate files
exit status of extract and score are 1 on error, not zero


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@585 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-08 22:28:26 +00:00
redpony
523527fa17 get rid of profiling
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@573 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-08 20:16:06 +00:00
redpony
db5a6bd11e fix bug that prevents | and _ from being tokenized properly.
fix bug in --parallel


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@572 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-08 20:13:57 +00:00
bojar
81ddb0e4f9 added train-factored... to releases, added dependency on our copy of phrase-extract
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@569 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-08 19:22:53 +00:00
bojar
303f411387 simplified Makefile, removed duplicate implementation of tokenize()
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@568 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-08 19:04:59 +00:00
phkoehn
b83fc72dd2 initial version of phrase-extract and phrase-score used by training script
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@567 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-08 18:54:28 +00:00
bojar
264f045a6b fixing ensure_absolute
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@556 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-08 03:41:50 +00:00
bojar
0541ce3689 just cleanup of variable initialization
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@555 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-08 01:54:50 +00:00
bojar
5290653a4d Added reduce_combine to release
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@554 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-08 01:26:08 +00:00
eherbst
384f8ccb07 adding sentence-by-sentence.pl: display all sentences in a corpus, system output vs. reference
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@552 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-08 00:09:40 +00:00
bojar
64ec2e5ca4 checking in multi-bleu.perl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@551 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-07 23:44:35 +00:00
bojar
ab5bb31797 allowing to override default paths
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@547 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-07 23:04:33 +00:00
bojar
26ce21f29b fixed unintended structure-sharing bug
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@541 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-07 22:17:02 +00:00
bojar
a41a4e95d6 now expects 3 numbers on [generation-file] lines before the pathname
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@538 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-07 21:55:06 +00:00
eherbst
0d91864621 adding scripts to extract POSs from LOPAR output and to extract arbitrary sets of factors from a corpus
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@530 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-07 17:35:16 +00:00
eherbst
8420ecf516 added statistical testing, both to compare different outputs and to get a confidence measure for a single output
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@529 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-07 17:22:39 +00:00
bojar
2d7cf749a6 Allowing scores in 'scientific' float format from moses.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@514 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-05 00:53:48 +00:00
bojar
10a0e23801 checking in reduce_combine
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@510 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-05 00:11:57 +00:00
bojar
a5c122dfc8 added mert to list of released files, make rules to release moses (personally or publicly)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@505 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-04 22:25:38 +00:00
bojar
12f50a5f26 Added labelling of scores in nbestlist and fixed mert to understand that.
Before release, these have to be checked:
- train-factored-phrase-model.perl (the whole process)
- mert on newly generated moses.ini with 2 weights for generation


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@492 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-04 04:45:48 +00:00
redpony
7d0e0f5698 fix
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@491 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-03 21:10:18 +00:00
redpony
7b11b66b6d enable --parallel in tfpm.perl
add a script to build a generation table from a monolingual corpus.
add a script to post-process the german morpho tagger output


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@490 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-03 21:05:55 +00:00
bojar
18dac34fe2 checking in the current version of cmert we're using
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@489 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-03 20:00:00 +00:00
bojar
232727e0e4 removed the dependence on external lowercaser, lowercasing internally
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@488 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-03 18:07:10 +00:00
bojar
c2fdfae2c1 modifying Makefile and released-files so that clean-n-corpus is properly released
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@487 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-03 17:55:56 +00:00
bojar
4d49e12bc4 checking the latest version from /export/bin to cvs
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@486 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-03 17:53:13 +00:00
nicolabertoldi
8b459e004a check in qsub-wrpper.pl with temporary log dir in the working dir
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@472 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-02 20:51:06 +00:00
nicolabertoldi
fac860e205 check in moses-parallel.pl with strict requirement
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@470 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-02 20:19:08 +00:00
bojar
32e73c3785 yet another clarification of messages
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@467 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-02 20:06:04 +00:00
bojar
763bb72642 clarification
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@460 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-02 12:21:17 +00:00
nicolabertoldi
8568c3beda Check in moses-parallel.pl with several bugs corrected
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@458 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-01 22:24:33 +00:00
bojar
60f9301ab7 Fixed matching of lambdas. (Back to the hardwired order.)
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@457 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-01 22:14:12 +00:00
phkoehn
63a86828ba Added setting "distortion-limit=6" to moses.ini
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@455 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-01 21:11:17 +00:00
bojar
9304d71469 improved passing (and checking) of command-line options
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@446 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-01 15:18:49 +00:00
callison-burch
c0968b9041 Updated the script so that it correctly passes the qflags argument along to the qsub_wrapper script.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@445 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-01 14:29:08 +00:00
bojar
e59035efca Default to use only our team's queue.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@437 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-01 02:43:36 +00:00
bojar
5f3965de12 various tiny bugfixes
added basic testcases
moved qsub-wrapper to generic


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@434 1f5c12ca-751b-0410-a591-d2e778427230
2006-08-01 01:18:13 +00:00
nicolabertoldi
7910b65cdf Check in generic/moses-parallel.pl
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@431 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 23:01:02 +00:00
eherbst
54ab89deab seems this script does not have the same functionality as Ondrej's, and his are meant for training and this for analysis
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@430 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 22:14:08 +00:00
phkoehn
6f80f8c12a Speed-up of lexical translation table training, old code was crap
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@429 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 22:10:34 +00:00
eherbst
3b46c17ace believe Ondrej has a script w/same functionality; will investigate
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@428 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 22:07:34 +00:00
eherbst
5cce8336c0 add CGI-based tool for calculating and displaying various error measures
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@427 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 22:05:11 +00:00
bojar
75a5f9e935 clearer error message
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@425 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 21:33:54 +00:00
bojar
540aadea2b Allowing to optimize unknown lambdas, release methodology
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@421 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 21:01:07 +00:00
bojar
1c2cd47881 checking in the current version of clone_moses_model
working on a single scripts directory


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@416 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 20:13:25 +00:00
nicolabertoldi
ba76013a5c *** empty log message ***
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@411 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 18:26:44 +00:00
bojar
a325df6380 renamed pythonpath variable, correctly passing --jobs, checking for blank moses.ini
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@410 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 17:00:22 +00:00
bojar
51ad454a39 checking in this useful script
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@409 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 16:53:40 +00:00
bojar
32853150fc added a placeholder
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@408 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 16:39:33 +00:00
bojar
57bcad0c5f the cleanup of mert-moses seems to be finished
added first simple 'make release' goal


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@405 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-31 14:17:43 +00:00
bojar
54c6554d09 Removed the 'run-moses' functionality, so that the script is now usable by various variants of moses. (parallel and non parallel, mainly, but also by mert-moses.pl and others).
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@378 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-29 01:47:33 +00:00
bojar
9f4178e36e Added
-rwxrwxr-x  1 pkoehn ws06osmt 5769 Jul 19 15:47 run-filtered-moses.perl
under a new name. Just for diffing purposes.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@377 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-29 01:45:14 +00:00
bojar
46ea34ab20 Merged the parallel and non-parallel copies of this script.
Changed the command line and added some options.
Added extensive checking of validity of input files and options.
Still not ready for deployment due to the following bugs:
- the generation of output moses.ini was not tested
- the --start-step option does not work (not critical)


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@376 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-29 01:41:52 +00:00
bojar
05b0a07892 Checking in the last version of
-rwxr-xr-x  1 nbertoldi ws06osmt 12430 Jul 28 00:01 mert-moses-parallel.perl-2006-07-27
Just for diff purposes.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@374 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-29 01:35:46 +00:00
bojar
6272fa6ecf Added the change in giza default options as done by ccb.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@373 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-29 01:18:58 +00:00
bojar
9061c682eb Checking in the version:
-rwxrwxr-x  29 obojar ws06osmt 51861 Jul 24 18:15 train-factored-phrase-model.perl


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@372 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-29 01:13:30 +00:00
bojar
6188fa338d basis for a cleaner way of handling with our scripts
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@369 1f5c12ca-751b-0410-a591-d2e778427230
2006-07-28 23:59:33 +00:00