mosesdecoder/scripts/analysis
Barry Haddow 2b4e61d826 Merge branch 'trunk' into miramerge
Compiles, not tested.

Conflicts:
	Jamroot
	OnDiskPt/PhraseNode.h
	OnDiskPt/TargetPhrase.cpp
	OnDiskPt/TargetPhrase.h
	OnDiskPt/TargetPhraseCollection.cpp
	mert/BleuScorer.cpp
	mert/Data.cpp
	mert/FeatureData.cpp
	moses-chart-cmd/src/Main.cpp
	moses/src/AlignmentInfo.h
	moses/src/ChartManager.cpp
	moses/src/LM/Ken.cpp
	moses/src/LM/Ken.h
	moses/src/LMList.h
	moses/src/LexicalReordering.h
	moses/src/PhraseDictionaryTree.h
	moses/src/ScoreIndexManager.h
	moses/src/StaticData.h
	moses/src/TargetPhrase.h
	moses/src/Word.cpp
	scripts/ems/experiment.meta
	scripts/ems/experiment.perl
	scripts/training/train-model.perl
2012-07-17 13:36:50 +01:00
perllib fix start weights in experiment.perl, add hypothesis queue for picking hope and fear translations, add variations to 1slack formulation 2012-06-01 01:49:42 +01:00
smtgui revert mode changes 2012-07-04 12:25:21 +01:00
bootstrap-hypothesis-difference-significance.pl added support for arbitrary encodings via the $IO_ENCODING global variable on line 23; set to UTF8 by default 2010-11-29 09:04:44 +00:00
extract-target-trees.py Add scripts/analysis/extract-target-trees.py 2011-09-19 09:08:24 +00:00
nontranslated_words.pl add svn id comments to start of file 2007-03-14 22:22:36 +00:00
oov.pl Merge branch 'master' into moses-svn 2010-04-21 14:48:32 +00:00
README fix start weights in experiment.perl, add hypothesis queue for picking hope and fear translations, add variations to 1slack formulation 2012-06-01 01:49:42 +01:00
sentence-by-sentence.pl create valid html header, according to Tomas Hudik 2011-10-12 10:18:36 +00:00
sg2dot.perl fixed regexes to read current -osg format 2010-02-03 14:35:21 +00:00
show-phrases-used.pl just setting the executable bit 2010-01-29 19:49:37 +00:00
suspicious_tokenization.pl list frequent mismatched tokenizations first 2010-02-03 16:37:08 +00:00
weight-scan-summarize.sh Ondrej's little tools to examine weight settings 2011-07-08 00:11:10 +00:00
weight-scan.pl Change Bin to RealBin. Thanks to Tom Hoar 2012-06-26 11:57:23 -04:00

Put any scripts useful for human analysis of MT output here.

sentence-by-sentence.pl [EVH]: show a color-coded, sentence-by-sentence comparison of reference translation(s)/system output(s)/(truth)
-- shows every sentence given, with non-matching words in the system output marked, a BLEU score for each sentence, and matching n-grams shown in a table
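-- requires all input files to be UTF-8 encoded (a Latin-1 file can be converted with `cat FILE | perl -n -e 'binmode(STDOUT, ":utf8"); print;' > FILE.utf8`; the one-liner reads raw bytes as Latin-1, so for any other source encoding use a tool such as iconv instead)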
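
To illustrate the kind of per-sentence comparison described above, here is a minimal Python sketch. It is not the code of sentence-by-sentence.pl: the function names, the `*` marker, and the example sentences are all made up for illustration, and it only counts clipped n-gram matches rather than computing a full BLEU score.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def matching_ngrams(hyp, ref, max_n=4):
    """For each n in 1..max_n, return (matched, total) for the hypothesis.

    Matches are clipped: an n-gram counts at most as often as it
    occurs in the reference.
    """
    result = {}
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        matched = sum(min(count, r[gram]) for gram, count in h.items())
        result[n] = (matched, sum(h.values()))
    return result

def mark_unmatched(hyp, ref, marker="*"):
    """Wrap hypothesis words that never occur in the reference in markers."""
    ref_words = set(ref)
    return [w if w in ref_words else f"{marker}{w}{marker}" for w in hyp]

# Hypothetical example sentences (whitespace-tokenized).
ref = "the cat sat on the mat".split()
hyp = "the cat sits on the mat".split()

print(" ".join(mark_unmatched(hyp, ref)))
for n, (matched, total) in matching_ngrams(hyp, ref).items():
    print(f"{n}-grams: {matched}/{total} matched")
```

The (matched, total) pairs per n-gram order are the raw ingredients of the modified n-gram precisions that BLEU combines, which is why they are a natural thing to tabulate next to each sentence.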

show-phrases-used.pl [EVH]: draw a color-coded diagram of which source phrases map to which target phrases
-- requires the Perl GD module, which in turn requires that the gd library be installed and in LD_LIBRARY_PATH
-- shows the average length of the source phrases used, for each sentence and overall
-- command-line options: -r for the reference and -s for the source; lone filenames are taken to be system outputs
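
The average source phrase length mentioned above can be sketched in a few lines of Python. This is an illustration only, not the script's implementation: it assumes each system output line carries source spans written as |start-end| after each target phrase (an assumption about the input format), and the example lines and function names are hypothetical.

```python
import re

# Assumed span annotation: "|start-end|" with inclusive source word indices.
SPAN = re.compile(r"\|(\d+)-(\d+)\|")

def source_phrase_lengths(line):
    """Return the length in words of every source phrase marked on a line."""
    return [int(end) - int(start) + 1 for start, end in SPAN.findall(line)]

def average(values):
    """Arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0

# Hypothetical annotated system output, one sentence per line.
lines = [
    "this is |0-1| a house |2-3|",
    "he |0-0| went home |1-3| . |4-4|",
]

per_sentence = [source_phrase_lengths(line) for line in lines]
for i, lengths in enumerate(per_sentence, 1):
    print(f"sentence {i}: average source phrase length {average(lengths):.2f}")

# The overall figure averages over all phrases, not over sentence averages.
overall = average([n for lengths in per_sentence for n in lengths])
print(f"overall: {overall:.2f}")
```

Averaging over all phrases rather than over the per-sentence averages keeps long sentences from being under-weighted; whether the script makes the same choice is not stated in this README.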