mosesdecoder

mirror of https://github.com/moses-smt/mosesdecoder.git synced 2024-12-29 15:04:05 +03:00

hieuhoang1972 b88fad16f8 create valid html header, according to Tomas Hudik git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@4336 1f5c12ca-751b-0410-a591-d2e778427230		2011-10-12 10:18:36 +00:00
perllib	minor, and moved stuff around	2006-08-08 23:38:45 +00:00
smtgui	add svn id comments to start of file	2007-03-14 22:22:36 +00:00
bootstrap-hypothesis-difference-significance.pl	added support for arbitrary encodings via the $IO_ENCODING global variable on line 23; set to UTF8 by default	2010-11-29 09:04:44 +00:00
extract-target-trees.py	Add scripts/analysis/extract-target-trees.py	2011-09-19 09:08:24 +00:00
nontranslated_words.pl	add svn id comments to start of file	2007-03-14 22:22:36 +00:00
oov.pl	Merge branch 'master' into moses-svn	2010-04-21 14:48:32 +00:00
README	updating docs	2006-08-16 16:37:11 +00:00
sentence-by-sentence.pl	create valid html header, according to Tomas Hudik	2011-10-12 10:18:36 +00:00
sg2dot.perl	fixed regexes to read current -osg format	2010-02-03 14:35:21 +00:00
show-phrases-used.pl	just setting the executable bit	2010-01-29 19:49:37 +00:00
suspicious_tokenization.pl	list frequent mismatched tokenizations first	2010-02-03 16:37:08 +00:00
weight-scan-summarize.sh	Ondrej's little tools to examine weight settings	2011-07-08 00:11:10 +00:00
weight-scan.pl	Ondrej's little tools to examine weight settings	2011-07-08 00:11:10 +00:00

README

Put any scripts useful for human analysis of MT output here.

sentence-by-sentence.pl [EVH]: show a color-coded, sentence-by-sentence comparison of reference translation(s), system output(s), and (truth)
-- shows every sentence given, marks non-matching words in the system output, gives a BLEU score for each sentence, and lists matching n-grams in a table
-- requires all input files to be UTF-8 encoded (you can convert a file with `cat FILE | perl -n -e 'binmode(STDOUT, ":utf8"); print;' > FILE.utf8`; this reads the input as raw bytes, so in effect it converts Latin-1 input to UTF-8)
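
The per-sentence matching described above can be pictured with a small Python sketch. It is not the code in sentence-by-sentence.pl: the function names (`ngrams`, `matching_ngrams`, `mark_unmatched`) and the asterisk marking are assumptions made for illustration, whitespace tokenization is assumed, and no BLEU score is computed, only the clipped n-gram match counts that such a score would build on.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def matching_ngrams(hyp, ref, max_n=4):
    """Count hypothesis n-grams that also occur in the reference.

    Counts are clipped to the reference count, for n = 1..max_n.
    Both arguments are whitespace-tokenized strings (an assumption).
    """
    hyp_toks, ref_toks = hyp.split(), ref.split()
    table = {}
    for n in range(1, max_n + 1):
        # Counter intersection keeps the minimum count of each n-gram.
        overlap = ngrams(hyp_toks, n) & ngrams(ref_toks, n)
        table[n] = sum(overlap.values())
    return table


def mark_unmatched(hyp, ref):
    """Wrap hypothesis words that never occur in the reference in asterisks."""
    ref_words = set(ref.split())
    return " ".join(w if w in ref_words else "*" + w + "*" for w in hyp.split())


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    system = "the cat sat on a mat"
    print(mark_unmatched(system, reference))
    print(matching_ngrams(system, reference))
```

Clipping to the reference count keeps a system output from being rewarded for repeating a word more often than the reference contains it.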

show-phrases-used.pl [EVH]: draw a color-coded diagram of which source phrases map to which target phrases
-- requires the Perl GD module, which in turn requires the gd library to be installed and on LD_LIBRARY_PATH
-- shows the average length of the source phrases used, for each sentence and overall
-- command-line options: -r for reference and -s for source; bare filenames are taken to be system outputs
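
The average source phrase length reported above can be sketched in Python as follows. This is an illustration only, not the logic of show-phrases-used.pl: the input representation (each sentence as a list of inclusive `(start, end)` source word spans) and the function name `average_phrase_lengths` are assumptions, as is the choice to weight the overall average by phrase rather than by sentence.

```python
def average_phrase_lengths(sentences):
    """Compute average source phrase length per sentence and overall.

    `sentences` is a list of sentences; each sentence is a list of
    (start, end) source word spans with inclusive indices. This input
    format is assumed for the example.
    Returns (per_sentence_averages, overall_average).
    """
    per_sentence = []
    total_words = 0
    total_phrases = 0
    for spans in sentences:
        lengths = [end - start + 1 for start, end in spans]
        # A sentence with no phrases gets an average of 0.0.
        per_sentence.append(sum(lengths) / len(lengths) if lengths else 0.0)
        total_words += sum(lengths)
        total_phrases += len(lengths)
    # Overall average is taken over all phrases, not over sentence averages.
    overall = total_words / total_phrases if total_phrases else 0.0
    return per_sentence, overall


if __name__ == "__main__":
    spans = [
        [(0, 1), (2, 2), (3, 5)],  # phrases of length 2, 1, 3
        [(0, 0), (1, 1)],          # two single-word phrases
    ]
    print(average_phrase_lengths(spans))
```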