mosesdecoder/scripts/analysis

Put any scripts useful for human analysis of MT output here.

sentence-by-sentence.pl [EVH]: show a color-coded, sentence-by-sentence comparison of reference translation(s)/system output(s)/(truth)
-- show all sentences given, with non-matching words in the system output marked, BLEU scores given by sentence, and matching n-grams shown in a table
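-- requires all input files to be UTF-8 encoded (if a file is Latin-1, you can convert it with `cat FILE | perl -n -e 'binmode(STDOUT, ":utf8"); print;' > FILE.utf8`; this reads raw bytes and writes each one out as a UTF-8 character, so do not run it on a file that is already UTF-8)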
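
As a rough illustration of what the per-sentence n-gram table and the marking of non-matching words involve, here is a minimal Python sketch. It is not the code in sentence-by-sentence.pl: the function names (`ngrams`, `matching_ngrams`, `mark_unmatched`) are hypothetical, it handles a single reference only, tokenizes on whitespace, and marks words with asterisks instead of color.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def matching_ngrams(hyp, ref, max_n=4):
    """Clipped n-gram match counts between one hypothesis and one reference.

    Returns {n: (matched, total)} for n = 1..max_n, where `total` is the
    number of n-grams in the hypothesis. Assumption: whitespace tokenization.
    """
    hyp_toks, ref_toks = hyp.split(), ref.split()
    table = {}
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp_toks, n))
        ref_counts = Counter(ngrams(ref_toks, n))
        # Clip each hypothesis n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        table[n] = (matched, sum(hyp_counts.values()))
    return table


def mark_unmatched(hyp, ref):
    """Wrap hypothesis words that never occur in the reference in *...*."""
    ref_vocab = set(ref.split())
    return " ".join(w if w in ref_vocab else "*" + w + "*" for w in hyp.split())
```

The `(matched, total)` pairs are the raw material for the modified n-gram precisions that a sentence-level BLEU score combines; the sketch stops short of the score itself because the smoothing the script applies is not described here.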

show-phrases-used.pl [EVH]: draw a color-coded diagram of which source phrases map to which target phrases
-- requires the Perl GD module, which in turn requires that gd be installed and in LD_LIBRARY_PATH
-- show average length of source phrases used for each sentence and overall
-- command-line options -r for reference and -s for source; lone filenames are taken to be system outputs
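
The average source phrase length can be sketched in Python as below. This is not the script's own code, and it assumes the system output carries Moses-style segmentation markers of the form `|start-end|` after each target phrase, giving the inclusive source word span; whether show-phrases-used.pl expects exactly this input is an assumption. The names `source_phrase_lengths` and `average_lengths` are hypothetical.

```python
import re

# Assumed marker format: |start-end| with inclusive source word indices.
SPAN = re.compile(r"\|(\d+)-(\d+)\|")


def source_phrase_lengths(line):
    """Return the length in words of each source phrase used in one sentence."""
    return [int(end) - int(start) + 1 for start, end in SPAN.findall(line)]


def average_lengths(lines):
    """Return (per-sentence averages, overall average) of source phrase length.

    Sentences without any span markers get an average of 0.0.
    """
    per_sentence = []
    all_lengths = []
    for line in lines:
        lengths = source_phrase_lengths(line)
        all_lengths.extend(lengths)
        per_sentence.append(sum(lengths) / len(lengths) if lengths else 0.0)
    overall = sum(all_lengths) / len(all_lengths) if all_lengths else 0.0
    return per_sentence, overall
```

Note that the overall figure is averaged over all phrases, not over the per-sentence averages, so long sentences weigh more.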