mosesdecoder

mirror of https://github.com/moses-smt/mosesdecoder.git synced 2024-09-19 23:27:46 +03:00

Author	SHA1	Message	Date
Barry Haddow	ad8114ddb0	capitalisation	2015-06-15 16:23:12 +01:00
XapaJIaMnu	166bf7365f	Forgot to update the weight config path	2015-06-12 16:56:36 +01:00
XapaJIaMnu	ffd3f2bb6e	Added basic BilingualNPLM support to EMS and an example config.	2015-06-12 16:21:24 +01:00
Jeroen Vermeulen	85c23ed7dc	Fix some JS lint.	2015-06-02 18:05:12 +07:00
Jeroen Vermeulen	0981d23705	Lint-fixing binge.	2015-06-02 16:02:39 +07:00
Jeroen Vermeulen	ef028446f3	Add license notices to scripts. This is not pleasant to read (and much, much less pleasant to write!) but sort of necessary in an open project. Right now it's quite hard to figure out what is licensed how, which doesn't matter much to most people but can suddenly become very important when people want to know what they're being allowed to do. I kept the notices as short as I could. As far as I could see, everything without a clear license notice is LGPL v2.1 or later.	2015-05-29 18:30:26 +07:00
Rico Sennrich	f6f56d11af	ems: parse-relax comes last in train; do same for dev/test	2015-05-25 15:52:07 +01:00
Rico Sennrich	98ff2382d0	duplication of existing functionality	2015-05-20 17:35:38 +01:00
Rico Sennrich	6aac7ded9a	EMS: more flexible way to concatenate LM training data. the implementation allows the user to specify which corpora to combine, and to have multiple LMs on the same data.	2015-05-20 17:20:02 +01:00
Rico Sennrich	8ca6764c7d	ems: allow LMs with user-specified training commands and moses.ini config entries intended for neural LMs, syntactic LMs, and the like. currently doesn't play nice with INTERPOLATED-LM.	2015-05-18 19:07:37 +01:00
Rico Sennrich	fb06a2325e	fix broken ems with interpolated lm disabled	2015-05-18 17:26:09 +01:00
Rico Sennrich	f85dd85f6b	ignore-unless magic	2015-05-18 16:17:33 +01:00
Rico Sennrich	59376f500b	still confused about pass-unless vs. ignore-unless	2015-05-18 14:40:56 +01:00
Rico Sennrich	45a97f9016	EMS: disable concatenated LM by default	2015-05-18 14:10:29 +01:00
Rico Sennrich	27fd45d088	ems: training LM on concatenation of all LM training corpora	2015-05-18 12:18:49 +01:00
Jeroen Vermeulen	e2a632a2b8	JavaScript lint.	2015-05-17 21:36:07 +07:00
Jeroen Vermeulen	5d0bbb6a45	Fix some JavaScript lint. Still a lot left.	2015-05-17 21:24:04 +07:00
Jeroen Vermeulen	a25193cc5d	Fix a lot of lint, mostly trailing whitespace. This is lint reported by the new lint-checking functionality in beautify.py. (We can change to a different lint checker if we have a better one, but it would probably still flag these same problems.) Lint checking can help a lot, but only if we get the lint under control.	2015-05-17 20:04:04 +07:00
Jeroen Vermeulen	61162dd242	Fix more Python lint. Most of the complaints fixed here were from Pocketlint, but many were also from Syntastic the vim plugin.	2015-05-16 17:26:56 +07:00
Hieu Hoang	abfc0671a3	osm tweaks and morfessor wrapper	2015-05-12 20:19:39 +04:00
Hieu Hoang	8bb18b9ff0	add no-splitter-training argument. Splitter to be used by mada	2015-05-11 15:26:50 +04:00
Barry Haddow	85c1af4d72	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2015-05-08 09:16:55 +01:00
Barry Haddow	f403f5e478	mmsapt doesn't require feature weights on first tuning iteration	2015-05-08 09:16:51 +01:00
Hieu Hoang	2acb590394	output bleu for multi-bleu hack	2015-05-05 17:54:35 +04:00
Hieu Hoang	d006c6ef8c	don't output remaining args twice	2015-05-05 12:15:08 +04:00
Hieu Hoang	8f272e04a9	output debugging messages to stderr, not stdout	2015-05-05 12:01:21 +04:00
Hieu Hoang	d456d9229e	add multi-bleu-detok. Like multi-bleu scoring but will detokenize/post-process before scoring	2015-05-03 14:07:12 +04:00
Philipp Koehn	a4a7c14593	allow breaking up training data for fast align (to avoid memory blowups for very large corpora)	2015-05-01 17:47:08 -04:00
Philipp Koehn	b369699661	various small changes, mostly related to better compliance with grid engine	2015-05-01 17:44:18 -04:00
Rico Sennrich	e98a2fc980	fix interpolation for LM with parser in pre-processing	2015-04-30 15:46:33 +01:00
Hieu Hoang	4b47e1148c	use ignore-unless /Philipp Koehn	2015-04-22 23:02:57 +04:00
Hieu Hoang	40933b4a78	hack to allow target side of tokenized parallel corpus to be used for LM	2015-04-22 19:01:12 +04:00
Hieu Hoang	ab01d30687	make sure GetOptions doesn't consume -T by confusing it with --text	2015-04-21 17:53:46 +04:00
Rico Sennrich	15d3c3f259	be more tolerant about xml input	2015-04-21 14:04:25 +01:00
Rico Sennrich	5a3d5b6bdd	EMS: LM:mock-parse can be actual parser	2015-04-21 10:21:24 +01:00
Hieu Hoang	1b9dc6cfae	more butinah tweaks	2015-04-19 11:50:50 +04:00
Hieu Hoang	637e8a17e8	add pre-tokenization cleaning script, in case training data has bad, overly long lines which blow up some taggers/segmenters, e.g. mada	2015-04-19 11:21:07 +04:00
Hieu Hoang	6162223690	add use warnings to all perl scripts	2015-04-13 20:42:33 +04:00
Dingyuan Wang	4aba64ed53	Merge pull request #106 from gumblex/master Fix some problems in EMS	2015-04-11 09:26:25 +08:00
Hieu Hoang	02185a85fb	store temp run files in current directory, not /tmp	2015-04-05 17:02:48 +04:00
Hieu Hoang	93ad52d2f9	leave in runPath for debugging	2015-04-05 16:49:12 +04:00
Hieu Hoang	7ffdddef13	script to submit ems job to grid engine as 1 job. Hardcoded for NYUAD at the mo	2015-04-05 16:44:24 +04:00
Dingyuan Wang	aea07b0a19	Fix some problems in EMS: * remove absolute links * fix coverage bar highlighting * change Base64 library to support UTF-8	2015-04-03 23:47:25 +08:00
Hieu Hoang	b2f9ba2b64	revert last commit to add MASTER_PATH. Not needed	2015-04-02 19:29:42 +04:00
Hieu Hoang	27b36e0c96	pass in PATH variable from master node. When you're running of a grid but really just qsubbing everything to 1 slave node	2015-04-02 19:15:21 +04:00
Hieu Hoang	2d1da3219d	consistently use 'env perl' command for environments where the 1st perl in PATH isn't the default perl. Which is kinda stupid	2015-04-02 17:38:56 +04:00
Hieu Hoang	e22d275c32	don't ignore lowercasing of factored LM. Must be consistent with pt	2015-04-01 23:25:57 +04:00
Phil Williams	6ce3060dd8	lmplz-wrapper.perl: use Getopt::Long's "pass_through" option. This avoids the need to duplicate all of lmplz's options in the wrapper and it prevents --prune 0 0 1 from being truncated to --prune 0 if the user forgets to quote the arguments.	2015-03-30 10:18:51 +01:00
Rico Sennrich	3a673fc8dc	EMS: support for syntactic metrics for MERT/MIRA - add "-n-best-trees" to TUNING:decoder-settings - add "mock-output-parser-references = $output-parser" to GENERAL (and define output-parser) - TUNING:tuning-settings should include the metric you want to optimize (e.g. "-batch-mira-args='--sctype BLEU,HWCM'")	2015-03-20 17:15:33 +00:00
Phil Williams	fc15e03ebe	Replace truecase-egret.sh with more general tree-converter-wrapper.perl	2015-03-18 09:57:42 +00:00

451 Commits