mosesdecoder

mirror of https://github.com/moses-smt/mosesdecoder.git synced 2024-09-19 23:27:46 +03:00

Author	SHA1	Message	Date
Rico Sennrich	6aac7ded9a	EMS: more flexible way to concatenate LM training data. the implementation allows the user to specify which corpora to combine, and to have multiple LMs on the same data.	2015-05-20 17:20:02 +01:00
Rico Sennrich	8ca6764c7d	ems: allow LMs with user-specified training commands and moses.ini config entries intended for neural LMs, syntactic LMs, and the like. currently doesn't play nice with INTERPOLATED-LM.	2015-05-18 19:07:37 +01:00
Rico Sennrich	fb06a2325e	fix broken ems with interpolated lm disabled	2015-05-18 17:26:09 +01:00
Rico Sennrich	f85dd85f6b	ignore-unless magic	2015-05-18 16:17:33 +01:00
Rico Sennrich	59376f500b	still confused about pass-unless vs. ignore-unless	2015-05-18 14:40:56 +01:00
Rico Sennrich	45a97f9016	EMS: disable concatenated LM by default	2015-05-18 14:10:29 +01:00
Rico Sennrich	27fd45d088	ems: training LM on concatenation of all LM training corpora	2015-05-18 12:18:49 +01:00
Jeroen Vermeulen	e2a632a2b8	JavaScript lint.	2015-05-17 21:36:07 +07:00
Jeroen Vermeulen	5d0bbb6a45	Fix some JavaScript lint. Still a lot left.	2015-05-17 21:24:04 +07:00
Jeroen Vermeulen	a25193cc5d	Fix a lot of lint, mostly trailing whitespace. This is lint reported by the new lint-checking functionality in beautify.py. (We can change to a different lint checker if we have a better one, but it would probably still flag these same problems.) Lint checking can help a lot, but only if we get the lint under control.	2015-05-17 20:04:04 +07:00
Jeroen Vermeulen	61162dd242	Fix more Python lint. Most of the complaints fixed here were from Pocketlint, but many were also from Syntastic the vim plugin.	2015-05-16 17:26:56 +07:00
Hieu Hoang	abfc0671a3	osm tweaks and morfessor wrapper	2015-05-12 20:19:39 +04:00
Hieu Hoang	8bb18b9ff0	add no-splitter-training argument. Splitter to be used by mada	2015-05-11 15:26:50 +04:00
Barry Haddow	85c1af4d72	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2015-05-08 09:16:55 +01:00
Barry Haddow	f403f5e478	mmsapt doesn't require feature weights on first tuning iteration	2015-05-08 09:16:51 +01:00
Hieu Hoang	2acb590394	output bleu for multi-bleu hack	2015-05-05 17:54:35 +04:00
Hieu Hoang	d006c6ef8c	don't output remaining args twice	2015-05-05 12:15:08 +04:00
Hieu Hoang	8f272e04a9	output debugging messages to stderr, not stdout	2015-05-05 12:01:21 +04:00
Hieu Hoang	d456d9229e	add multi-bleu-detok. Like multi-bleu scoring but will detokenize/post-process before scoring	2015-05-03 14:07:12 +04:00
Philipp Koehn	a4a7c14593	allow breaking up training data for fast align (to avoid memory blowups for very large corpora)	2015-05-01 17:47:08 -04:00
Philipp Koehn	b369699661	various small changes, mostly related to better compliance with grid engine	2015-05-01 17:44:18 -04:00
Rico Sennrich	e98a2fc980	fix interpolation for LM with parser in pre-processing	2015-04-30 15:46:33 +01:00
Hieu Hoang	4b47e1148c	use ignore-unless /Philipp Koehn	2015-04-22 23:02:57 +04:00
Hieu Hoang	40933b4a78	hack to allow target side of tokenized parallel corpus to be used for LM	2015-04-22 19:01:12 +04:00
Hieu Hoang	ab01d30687	make sure GetOptions doesn't consume -T by confusing it with --text	2015-04-21 17:53:46 +04:00
Rico Sennrich	15d3c3f259	be more tolerant about xml input	2015-04-21 14:04:25 +01:00
Rico Sennrich	5a3d5b6bdd	EMS: LM:mock-parse can be actual parser	2015-04-21 10:21:24 +01:00
Hieu Hoang	1b9dc6cfae	more butinah tweaks	2015-04-19 11:50:50 +04:00
Hieu Hoang	637e8a17e8	add pre-tokenization cleaning script, in case training data has bad, overly long lines which blow up some taggers/segmenters, e.g. mada	2015-04-19 11:21:07 +04:00
Hieu Hoang	6162223690	add use warnings to all perl scripts	2015-04-13 20:42:33 +04:00
Dingyuan Wang	4aba64ed53	Merge pull request #106 from gumblex/master Fix some problems in EMS	2015-04-11 09:26:25 +08:00
Hieu Hoang	02185a85fb	store temp run files in current directory, not /tmp	2015-04-05 17:02:48 +04:00
Hieu Hoang	93ad52d2f9	leave in runPath for debugging	2015-04-05 16:49:12 +04:00
Hieu Hoang	7ffdddef13	script to submit ems job to grid engine as 1 job. Hardcoded for NYUAD at the mo	2015-04-05 16:44:24 +04:00
Dingyuan Wang	aea07b0a19	Fix some problems in EMS: remove absolute links; fix coverage bar highlighting; change Base64 library to support UTF-8	2015-04-03 23:47:25 +08:00
Hieu Hoang	b2f9ba2b64	revert last commit to add MASTER_PATH. Not needed	2015-04-02 19:29:42 +04:00
Hieu Hoang	27b36e0c96	pass in PATH variable from master node, for when you're running on a grid but really just qsubbing everything to 1 slave node	2015-04-02 19:15:21 +04:00
Hieu Hoang	2d1da3219d	consistently use 'env perl' command for environments where the 1st perl in PATH isn't the default perl. Which is kinda stupid	2015-04-02 17:38:56 +04:00
Hieu Hoang	e22d275c32	don't ignore lowercasing of factored LM. Must be consistent with pt	2015-04-01 23:25:57 +04:00
Phil Williams	6ce3060dd8	lmplz-wrapper.perl: use Getopt::Long's "pass_through" option. This avoids the need to duplicate all of lmplz's options in the wrapper and it prevents --prune 0 0 1 from being truncated to --prune 0 if the user forgets to quote the arguments.	2015-03-30 10:18:51 +01:00
Rico Sennrich	3a673fc8dc	EMS: support for syntactic metrics for MERT/MIRA: add "-n-best-trees" to TUNING:decoder-settings; add "mock-output-parser-references = $output-parser" to GENERAL (and define output-parser); TUNING:tuning-settings should include the metric you want to optimize (e.g. "-batch-mira-args='--sctype BLEU,HWCM'")	2015-03-20 17:15:33 +00:00
Phil Williams	fc15e03ebe	Replace truecase-egret.sh with more general tree-converter-wrapper.perl	2015-03-18 09:57:42 +00:00
Phil Williams	0a8e5fb3bf	EMS: fix TRAINING:use-syntax-input-weight-feature option	2015-03-13 17:18:56 +00:00
Hieu Hoang	ce8b0e0876	fix example for reusing tuned moses.ini file	2015-03-13 15:07:23 +00:00
Philipp Koehn	530d0f5a11	some more better defaults for recaser	2015-03-11 17:56:02 +00:00
Philipp Koehn	2ce45229f8	better default configuration for recaser	2015-03-11 17:52:30 +00:00
Philipp Koehn	1632c5f39d	proper handling of specified configuration file	2015-03-11 16:49:20 +00:00
Matthias Huck	01bed83cf9	GHKM extraction: option to strip non-terminal labels from BitPar syntactic parses right during extraction (i.e., remove any suffix starting with a hyphen from the label)	2015-03-10 21:25:32 +00:00
Phil Williams	9e2eb702dc	EMS: add TRAINING:use-syntax-input-weight-feature option	2015-03-10 11:40:49 +00:00
Phil Williams	7eba58b942	EMS: add TRAINING:dont-tune-glue-grammar option. Adds -dont-tune-glue-grammar to train-model.perl command during config file generation step. This is preferable to manually adding -dont-tune-glue-grammar to TRAINING:training-options because changing its value won't trigger a re-run of dependent steps that don't really need re-running (like word alignment).	2015-03-10 10:20:19 +00:00

443 Commits