Commit Graph

13693 Commits

Author SHA1 Message Date
Rico Sennrich
1568afb737 on-the-fly unbinarization of internal tree structure (for translation models extracted from binarized treebanks) 2015-03-18 17:36:32 +00:00
Phil Williams
fc15e03ebe Replace truecase-egret.sh with more general tree-converter-wrapper.perl 2015-03-18 09:57:42 +00:00
Phil Williams
ac51e9f0a8 Always use "SyntaxInputWeight0" as name of SyntaxInputWeight feature 2015-03-18 09:56:46 +00:00
Hieu Hoang
1ca4f42539 codelite 2015-03-17 18:08:56 +00:00
Hieu Hoang
63d8b390b4 Changes to RuleScope from private branch. More codelite projects 2015-03-17 11:50:33 +00:00
Hieu Hoang
25feb7e47b option to change the estimated score only, not actual score 2015-03-17 10:25:34 +00:00
Hieu Hoang
e1a5c1e140 start CodeLite project files 2015-03-16 22:42:21 +00:00
Hieu Hoang
42cbebe550 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2015-03-16 18:16:15 +00:00
Hieu Hoang
767163c96e start CodeLite project files 2015-03-16 18:14:06 +00:00
Ulrich Germann
dcffbb5f4d Made LRModel::ReorderingType an enumerated type. 2015-03-16 00:24:11 +00:00
Ulrich Germann
085c88cc7b Eliminated sources of some compiler warnings (unused variables; signed/unsigned comparisons). 2015-03-15 22:45:01 +00:00
Ulrich Germann
ad805c133b Instances of InputType (and derived classes) now know which TranslationTask (if any) created them.
This is a first step towards providing phrase tables etc. access to context information etc.
associated with specific translation tasks.
2015-03-15 20:38:31 +00:00
Ulrich Germann
16d8ef67f0 Merge branch 'master' of https://github.com/moses-smt/mosesdecoder 2015-03-15 13:34:08 +00:00
Ulrich Germann
2a66a55c85 Added document map (maps from sentences to document ids) to Bitext class.
Minor overhaul to the bias regime, which allows specifying bias by document
name (as provided in the document map) rather than by sentence in the static
parallel corpus.
2015-03-15 13:32:09 +00:00
mjdenkowski
0714521367 Meteor compatibility with batch MIRA 2015-03-13 17:41:53 -04:00
Phil Williams
0a8e5fb3bf EMS: fix TRAINING:use-syntax-input-weight-feature option 2015-03-13 17:18:56 +00:00
Hieu Hoang
ce8b0e0876 fix example for reusing tuned moses.ini file 2015-03-13 15:07:23 +00:00
Phil Williams
05872cf32f Add tree-converter-mosesxml.sh wrapper script 2015-03-12 22:27:43 +00:00
Phil Williams
4685474e9b parse-en-egret.perl: wrap tree in parentheses prior to conversion to XML 2015-03-12 09:49:28 +00:00
Ulrich Germann
bc91743820 Merge branch 'master' of https://github.com/moses-smt/mosesdecoder 2015-03-11 23:32:12 +00:00
Ulrich Germann
a49b76be3f Quick hack to make moses not stumble over double-dash parameter specifications. 2015-03-11 23:32:06 +00:00
Kenneth Heafield
54304fd473 Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
Conflicts:
	moses/ExportInterface.cpp
2015-03-11 17:43:48 -04:00
James Zhang
23704613de added a simple translation interface 2015-03-11 17:33:17 -04:00
Philipp Koehn
530d0f5a11 some more better defaults for recaser 2015-03-11 17:56:02 +00:00
Philipp Koehn
2ce45229f8 better default configuration for recaser 2015-03-11 17:52:30 +00:00
Philipp Koehn
1632c5f39d proper handling of specified configuration file 2015-03-11 16:49:20 +00:00
Matthias Huck
534a894c0b glue rules with stripped BitPar labels 2015-03-10 22:02:21 +00:00
Matthias Huck
01bed83cf9 GHKM extraction: option to strip non-terminal labels from BitPar syntactic parses right during extraction (i.e., remove any suffix starting with a hyphen from the label) 2015-03-10 21:25:32 +00:00
Hieu Hoang
2fe8bccd2b remove visual studio and xcode project files. No longer maintained 2015-03-10 16:19:13 +00:00
Hieu Hoang
1705e29212 Merge branch 'master' of github.com:moses-smt/mosesdecoder 2015-03-10 16:05:13 +00:00
Hieu Hoang
ee6948b168 eclipse 2015-03-10 16:04:30 +00:00
Phil Williams
c7cf33ee05 parse-en-egret.perl: use "ROOT" instead of "TOP" as label of root tree node
This is to match the label Egret assigns to the root vertices of forests.
2015-03-10 15:43:14 +00:00
Hieu Hoang
ad73919979 merge with private branch 2015-03-10 15:28:45 +00:00
Phil Williams
77faaaea6c Add truecase-egret.sh
This is currently just a wrapper for Travatar's tree-converter tool.
2015-03-10 14:36:28 +00:00
Phil Williams
f7b4d403e3 Add parse-en-egret.perl wrapper script. 2015-03-10 14:32:59 +00:00
Phil Williams
9e88f794e6 Add phrase-extract/postprocess-egret-forests
This performs some minor transformations to Egret forests: escaping of
Moses special characters; removal of "^g" suffixes from constituent labels;
and marking of slash/hyphen split points (using @ characters).
2015-03-10 13:51:30 +00:00
Phil Williams
9e2eb702dc EMS: add TRAINING:use-syntax-input-weight-feature option 2015-03-10 11:40:49 +00:00
Phil Williams
91abb69cdf train-model.perl: add -use-syntax-input-weight-feature option
Currently only used for forest input.
2015-03-10 11:39:14 +00:00
Phil Williams
e8a7163f0d Add SyntaxInputWeight feature function
Currently only used for forest input.
2015-03-10 11:07:04 +00:00
Ulrich Germann
137b07a486 Merge branch 'master' of https://github.com/moses-smt/mosesdecoder 2015-03-10 10:41:47 +00:00
Ulrich Germann
51824355f9 Sampling now keeps track of counts for hierarchical lexicalized reordering. 2015-03-10 10:41:41 +00:00
Phil Williams
7eba58b942 EMS: add TRAINING:dont-tune-glue-grammar option
Adds -dont-tune-glue-grammar to train-model.perl command during config file
generation step.  This is preferable to manually adding -dont-tune-glue-grammar
to TRAINING:training-options because changing its value won't trigger a re-run
of dependent steps that don't really need re-running (like word alignment).
2015-03-10 10:20:19 +00:00
Phil Williams
e79644540c train-model.perl: add -dont-tune-glue-grammar option 2015-03-10 09:53:12 +00:00
Phil Williams
fd3dcb7bb0 filter-model-given-input.pl: add -[no]StripXml and -SyntaxFilterCmd options
-noStripXml is required for tree and forest input in STSG-based models.

-SyntaxFilterCmd can be used to set the command for filtering rule tables in
syntax-based models.  The default is to use

    $SCRIPTS_ROOTDIR/../bin/filter-rule-table

The option -MinNonInitialRuleCount is deprecated.
2015-03-10 08:57:56 +00:00
Phil Williams
70bef90b36 train-model.perl: add -score-command option
This matches the existing -extract-command option.  Given the argument value
<name>, train-model.perl will use the score program in

  $SCRIPTS_ROOTDIR/../bin/<name>

The default value is "score".
2015-03-10 08:48:54 +00:00
Matthias Huck
25f5470216 GHKM: write target parts-of-speech as a factor 2015-03-09 21:54:03 +00:00
Matthias Huck
524ed4406e pragma once 2015-03-09 21:44:54 +00:00
Matthias Huck
559077f6f8 some moderate modifications in phrase-extract/score-main.cpp
(e.g., use Moses::Scan<>() rather than atof()/atoi())
2015-03-09 18:49:32 +00:00
Matthias Huck
973fd98052 conservative update of some old code in phrase-extract/consolidate-main.cpp 2015-03-09 18:47:28 +00:00
Matthias Huck
0c79e19ff9 consolidate properties: fixing bug from commit b08d3ed 2015-03-09 18:44:02 +00:00