1347 Commits

Author SHA1 Message Date
Barry Haddow
16ea68f55f Fix bug in mml scoring
Line length calculation was out of step with LM scoring.
2012-12-10 15:54:24 +00:00
phikoehn
ed2d191821 allow specification of end point for experiment.perl 2012-12-10 05:56:51 +00:00
phikoehn
ccf9e13d8e bug fix with multicore parallelizer 2012-12-09 22:27:02 +00:00
phikoehn
466b502ae0 minor bug fixes with MML 2012-12-09 20:31:20 +00:00
Hieu Hoang
55e5af4785 add my workstation to ems list 2012-12-07 19:24:58 +00:00
phikoehn
ab2effb6fe train MML in-/out-of-domain language models with same vocabulary 2012-12-01 13:46:59 +00:00
Hieu Hoang
5fd9cbb529 delete reference to numpy. Doesn't need it 2012-11-30 10:28:51 +00:00
phikoehn
338b7656a6 ooops 2012-11-30 07:36:59 +00:00
phikoehn
84cb04c05a fixes and extensions to modified Moore-Lewis filtering, now works with domain features 2012-11-30 07:28:31 +00:00
phikoehn
1f7ee0e6c5 change of settings for sigtest filtering 2012-11-29 23:44:10 +00:00
Barry Haddow
f0e12912e7 mml-score.py. Support for combining with domain features. 2012-11-27 15:58:55 +00:00
phikoehn
b5d08745a5 extensions to modified moore-lewis filtering, bug fixes 2012-11-24 20:13:14 +00:00
phikoehn
ea610a0558 added modified-moore-lewis from Barry Haddow into EMS 2012-11-24 12:43:13 +00:00
phikoehn
d4cebb008a added ems support for sigtest-filter 2012-11-23 17:35:13 +00:00
Hieu Hoang
487822ed14 don't write to stdout 2012-11-22 15:08:00 +00:00
phikoehn
c2a96fcc33 adjust to irstlm changes 2012-11-20 17:19:17 +00:00
phikoehn
5cd614ecd8 adjust to irstlm changes 2012-11-20 17:18:57 +00:00
Hieu Hoang
7d6d91a2e8 move zmert to contrib folder 2012-11-20 16:42:49 +00:00
Barry Haddow
2a88fd0730 Support for compact phrase table in EMS
It should be sufficient to add a line like
ttable-binarizer = "/home/bhaddow/moses/dist/bin/processPhraseTableMin"
to your EMS config, and everything else will be taken care of. You can
add other arguments to the processPhraseTableMin, for example for
threading, by putting them in the quotes.

Note that this is not fully tested, since there are currently some
issues with the compact phrase table introduced by the sparse feature
merge.
2012-11-16 15:07:07 +00:00
Barry Haddow
a90e1861c0 Alignments on by default for phrase-based 2012-11-15 12:35:43 +00:00
Barry Haddow
5e3726eb90 Remove -use-alignment-info 2012-11-15 09:42:58 +00:00
Hieu Hoang
ea24000a84 c++ version of fuzzy match works 2012-11-12 13:58:39 +00:00
Hieu Hoang
ca3a4b2598 c++ version of fuzzy match works 2012-11-12 13:20:14 +00:00
Hieu Hoang
fc3a84a172 it's all gone pete tong 2012-11-11 19:42:50 +00:00
Hieu Hoang
b75e26c686 fuzzy match bug. Everything matches except alignments 2012-11-10 18:38:49 +00:00
Hieu Hoang
27a6cf2ebc fuzzy match bug. Still not bug-free 2012-11-10 17:09:07 +00:00
Hieu Hoang
baf210f077 fizzy match bug 2012-11-10 14:57:27 +00:00
Barry Haddow
095b307cfc Make sure alignment info is not on for hiero 2012-11-08 18:16:53 +00:00
Barry Haddow
12786dd58f Alignments on by default.
Use TRAINING:include-word-alignment-in-rules = no
to turn them off.
2012-11-08 17:52:10 +00:00
Mark Fishel
01eb79cf6f fixed berkeley parser wrapper's broken output in case of in-text parentheses 2012-11-07 11:18:36 +01:00
Barry Haddow
01c4de24b7 don't delete moses ini specified in config 2012-11-06 10:59:15 +00:00
Hieu Hoang
d13c761505 perl to cpp 2012-10-25 19:11:03 +01:00
Hieu Hoang
9aeb368af2 perl to cpp 2012-10-25 18:06:07 +01:00
Hieu Hoang
1770b19c11 perl to cpp 2012-10-25 17:58:30 +01:00
Hieu Hoang
3bb4c3994d perl to cpp 2012-10-25 16:30:16 +01:00
Hieu Hoang
33063e8d4b perl to cpp 2012-10-25 16:20:00 +01:00
Hieu Hoang
6657c5b2b3 perl to cpp 2012-10-24 18:54:41 +01:00
Hieu Hoang
ae97ddc9fe perl to cpp 2012-10-24 17:53:11 +01:00
Hieu Hoang
d8ecb47d85 perl to cpp 2012-10-24 10:17:43 +01:00
Hieu Hoang
6baa3d7e95 converting create_xml.perl to create_xml.cpp 2012-10-23 18:21:54 +01:00
Hieu Hoang
6f7f2cf332 take out debug arg from perl script 2012-10-23 14:53:53 +01:00
Hieu Hoang
d241fa4e47 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2012-10-20 14:01:21 +01:00
Hieu Hoang
99d5e738aa use kenlm if sri specified 2012-10-20 14:01:11 +01:00
Hieu Hoang
ea48ab7845 perltidy 2012-10-19 18:38:46 +01:00
Barry Haddow
f9d0721145 correct format for word trans features 2012-10-19 09:07:18 +01:00
phikoehn
0dda804c46 sparse feature fixes 2012-10-18 03:09:49 +01:00
phikoehn
7d66a4f8d5 superminortiny fix 2012-10-18 02:52:30 +01:00
phikoehn
98dafc0301 Merge branch 'master' of git://github.com/moses-smt/mosesdecoder 2012-10-18 02:20:45 +01:00
phikoehn
f19d53dac9 minor fixes 2012-10-18 02:20:38 +01:00
Barry Haddow
365e680115 Merge remote-tracking branch 'origin/master' into miramerge
NB Untested

Conflicts:
	Jamroot
	moses-chart-cmd/src/Main.cpp
	moses/src/ChartManager.cpp
	moses/src/RuleTable/LoaderStandard.cpp
	moses/src/RuleTable/PhraseDictionaryALSuffixArray.cpp
	moses/src/Word.cpp
2012-10-15 21:35:56 +01:00
Hieu Hoang
f3ec76ac56 minor change to calling irst training 2012-10-14 19:51:46 +01:00
Barry Haddow
61ae24aa5d Merge remote-tracking branch 'origin/master' into miramerge
Conflicts:
	moses/src/PhraseDictionary.cpp
	moses/src/TargetPhrase.cpp
	moses/src/TargetPhrase.h
2012-10-14 14:18:03 +01:00
Lane Schwartz
c541c77b2f Merge branch 'master' of www:/repos/git/Decoders/mosesdecoder 2012-10-11 10:15:00 -04:00
Lane Schwartz
11679849db In verbose mode, experiment.perl should print the full qsub command
prior to actually running it.
2012-10-10 13:25:58 -04:00
Lane Schwartz
0904531749 Make experiment.perl use qsub-settings from GENERAL section
if no qsub-settings are defined for the specific section being run.
2012-10-10 13:25:09 -04:00
Ales Tamchyna
608f6ba607 handle binarized phrase table in clone_moses_model.pl 2012-10-09 16:05:12 +02:00
Barry Haddow
848aafb644 Merge remote branch 'github/master' into miramerge
Conflicts:
	moses/src/AlignmentInfo.cpp
	moses/src/AlignmentInfo.h
	moses/src/ChartHypothesis.cpp
	moses/src/ChartTrellisNode.cpp
	moses/src/LM/Implementation.cpp
	moses/src/LM/Ken.cpp
	moses/src/TargetPhrase.cpp
	moses/src/TargetPhrase.h
2012-10-08 17:54:59 +01:00
Phil Williams
b2b9751227 parse-de-bitpar.perl: fix special char handling
Unescape special characters in input to BitPar and then re-escape
in output.
2012-10-06 16:27:33 +01:00
phikoehn
04544f8bfc better error message when reference file not found 2012-10-04 23:22:19 +01:00
Phil Williams
289a9ea54f experiment.meta: update pcfg-extract and pcfg-score
EMS now looks for binaries in $moses-bin-dir instead of their old
location in $moses-script-dir.
2012-10-03 19:57:51 +01:00
Eva Hasler
e7e4dbd405 merge remaining changes to mira, word pair features, phrase pair features 2012-10-03 18:53:55 +01:00
Lane Schwartz
82ab7c1507 Force ems to pass -S /bin/bash to qsub.
For reasons that defy comprehension, when qsub runs scripts,
it blatantly ignores the shebang line that specifies a shell to use.

Instead, SGE has its own config variable that defines
what shell to use when running scripts via qsub.

The -S /bin/bash option to qsub forces SGE
to launch your script using bash.

The scripts created by experiment.perl
all assume they will be run with bash,
so it is incumbent upon experiment.perl
to ensure that SGE uses bash to run them.
2012-10-02 09:24:20 -04:00
Eva Hasler
ebbf0d028c changes to mira, adding jackknife to experiment.perl 2012-10-01 20:36:52 +01:00
Lane Schwartz
ca6a751f6e Merge branch 'master' of www:/repos/git/Decoders/mosesdecoder 2012-09-28 15:30:24 -04:00
Lane Schwartz
7b042edc6c Send stderr to /dev/null when looking for pawd.
This cleans up the logs a bit for those of us who don't have pawd.
Otherwise, messages like the following show up in the logs:

/usr/bin/which: no pawd in ...
2012-09-28 14:55:09 -04:00
Lane Schwartz
a323c8daf7 Send stderr to /dev/null when looking for pawd.
This cleans up the logs a bit for those of us who don't have pawd.
Otherwise, messages like the following show up in the logs:

/usr/bin/which: no pawd in ...

bash: pawd: command not found
2012-09-28 14:37:53 -04:00
phikoehn
8bb49c9053 chart decoder search graph viz and other fixes to web interface of ems 2012-09-26 22:57:15 +01:00
Barry Haddow
0a950ee9f4 Merge remote branch 'github/master' into miramerge
Compiles, but not tested. Had to disable relent filter. Strangely, it seems to contain the
whole of moses-cmd.

Conflicts:
	Jamroot
	OnDiskPt/TargetPhrase.cpp
	moses-cmd/src/Main.cpp
	moses/src/AlignmentInfo.cpp
	moses/src/AlignmentInfo.h
	moses/src/ChartTranslationOptionCollection.cpp
	moses/src/ChartTranslationOptionCollection.h
	moses/src/GenerationDictionary.cpp
	moses/src/Jamfile
	moses/src/Parameter.cpp
	moses/src/PhraseDictionary.cpp
	moses/src/StaticData.cpp
	moses/src/StaticData.h
	moses/src/TargetPhrase.h
	moses/src/TranslationSystem.cpp
	moses/src/TranslationSystem.h
	moses/src/Word.cpp
	phrase-extract/score.cpp
	regression-testing/Jamfile
	scripts/ems/experiment.meta
	scripts/ems/experiment.perl
	scripts/training/train-model.perl
2012-09-26 22:49:33 +01:00
Hieu Hoang
24a6425b23 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2012-09-25 10:57:26 +01:00
Hieu Hoang
b761bd3237 exit 0 on success. /Henry Hu 2012-09-25 10:57:01 +01:00
phikoehn
a84fbcb80a bug fix for using domain feature in multi-process extract 2012-09-24 11:14:36 +01:00
Philipp Koehn
4749e1b990 allow mert use weights in config file for first decoder run 2012-09-24 11:11:40 +01:00
Barry Haddow
8051981747 Fix so that it works without mira 2012-09-21 09:24:57 +01:00
Barry Haddow
37494dd673 fix phikoehn's merge conflict 2012-09-20 10:28:03 +01:00
Barry Haddow
58b6697dd3 Fix compile bug from merge 2012-09-13 20:00:20 +01:00
Barry Haddow
f16965b841 Fix compile errors. Fix error when no interpolated lm. 2012-09-13 19:50:18 +01:00
Eva Hasler
e6c73ec611 remove hardwired path 2012-09-11 17:29:15 +01:00
Rico Sennrich
4e2fc82854 new training option -write-lexical-counts
(creates additional files lex.counts.e2f and lex.counts.f2e)
2012-09-06 11:48:54 +02:00
Jonathan Clark
f5137c1a48 Accept compact phrase table and reordering models 2012-09-04 10:11:56 -04:00
Hieu Hoang
30e5b0575b merge conflict 2012-09-03 19:12:00 +01:00
phikoehn
5d9859ba0e merge issues 2012-09-03 07:27:41 +01:00
phikoehn
19ef785146 bug fixes 2012-09-03 07:24:31 +01:00
phikoehn
0e783dc529 bug fix to enable pruned search graph output by default 2012-09-03 07:23:32 +01:00
Hieu Hoang
c639cdbb38 binary hiero reordering feature. Integrated into train-model.perl and experiment.perl. In the 2nd to last position in phrase table, just in front of 2.718 2012-08-28 17:01:08 +01:00
Hieu Hoang
4519b2b14c roll back time command. Doesn't run on Mac OSX 2012-08-28 16:31:02 +01:00
Nadi Tomeh
e90744c21c Merge branch 'master' of ssh://github.com/moses-smt/mosesdecoder 2012-08-25 23:55:15 +02:00
Nadi Tomeh
02ddf45671 Remove a test from the function get_vocabulary which prevented vocabulary files from being generated if the option -no-lexical-weighting was specified. 2012-08-25 22:46:44 +02:00
Hieu Hoang
69fc00faf9 singleton feature in phrase table. Like similar feature in Adam's suffix array, as implemented in cdec 2012-08-24 00:54:05 +01:00
Hieu Hoang
20341c091f remove escape-only tokenizer. A script already available 2012-08-22 13:04:10 +01:00
Hieu Hoang
74e59b9b25 tokenizer and detok that only does escaping 2012-08-22 12:32:19 +01:00
Jonathan Clark
680e0ff987 Merge branch 'master' of https://github.com/moses-smt/mosesdecoder 2012-08-21 15:27:49 -04:00
Jonathan Clark
3790c67750 Add a bit of documentation on how flags are being formed for the lexical-reordering-score program. 2012-08-21 15:27:18 -04:00
phikoehn
4a1a995878 a lot of changes 2012-08-18 23:48:26 +01:00
phikoehn
366ab93f8a a lot of changes 2012-08-18 23:47:05 +01:00
Karel Bílek
87620cd52b Adding time information to the output files 2012-08-18 15:37:53 +03:00
Lane Schwartz
6ed9b31c69 Merge branch 'master' of www:/repos/git/Decoders/mosesdecoder 2012-08-15 12:53:14 -04:00
Lane Schwartz
1883090a3d Moved code for calculating lexical translation probabilities
into a new perl module called LexicalTranslationModel.pm.

This commit moves the subroutine get_lexical (and its helper subroutines)
from train-model.perl into LexicalTranslationModel.pm. This new perl module
is now imported at the top of train-model.perl.

This change should not affect users of train-model.perl at all.

Doing this allows for the implementation of a stand-alone script
which can be used to create lexical translation model files directly,
given a word-aligned parallel corpus. This is often useful to do, and
should now be easier to do. The new script is get-lexical.perl.

Usage:
scripts/training/get-lexical.perl source target alignments output_prefix

Results:
output_prefix.f2e
output_prefix.e2f
2012-08-15 10:45:36 -04:00
Hieu Hoang
e2b4c3074d Merge branch 'master' of github.com:moses-smt/mosesdecoder 2012-08-10 12:08:31 +01:00
Hieu Hoang
7757a3c56b Error from sort 2012-08-10 11:18:19 +01:00
Lane Schwartz
6f6ef3eb5d Merge branch 'master' of www:/repos/git/Decoders/mosesdecoder 2012-08-09 11:18:15 -04:00