mosesdecoder

mirror of https://github.com/moses-smt/mosesdecoder.git synced 2024-12-27 05:55:02 +03:00

Author	SHA1	Message	Date
Nicola Bertoldi	16e4220f17	functions to handle Document-Level Translation tags	2013-08-14 12:20:51 +02:00
Nicola Bertoldi	614d7a0376	beautify	2013-08-11 23:43:26 +02:00
Nicola Bertoldi	5868653bd6	beautify	2013-08-11 23:41:23 +02:00
Nicola Bertoldi	7411227305	clean up related to the PhrasePenalty producer; transform the PhrasePenalty basic feature functions into a FF like WordPenalty	2013-08-11 23:32:54 +02:00
phikoehn	e50fc722e9	bug fix alternative weight setting	2013-08-07 15:35:40 +01:00
phikoehn	67c3063574	Merge branch 'master' of ssh://github.com/moses-smt/mosesdecoder	2013-08-07 05:32:59 +01:00
phikoehn	ab4e3c63a6	enriched trace	2013-08-07 05:31:45 +01:00
Hieu Hoang	302eec8283	beautify	2013-08-05 12:11:59 +01:00
Kenneth Heafield	78cdf82de8	Log10/loge weight change for incremental. TODO: debug n-best list generation	2013-08-02 17:56:41 +01:00
Hieu Hoang	f234aa203f	number recognizer treats each word as atomic, replace all of the word or nothing at all. Recognizer is designed to be run after the text has been tokenized, not before	2013-08-01 16:55:11 +01:00
Rico Sennrich	b32366ab8c	fix future and total cost in multimodel(counts). (was broken since merge of branch weight-new in May)	2013-07-31 14:18:18 +02:00
Rico Sennrich	d0e2c43011	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2013-07-30 17:18:32 +02:00
Rico Sennrich	a15bc05a33	rename multimodel weights in moses server (harmonization with the new config format)	2013-07-30 17:02:34 +02:00
Rico Sennrich	29cde2a204	allow overriding table filtering in config (required for multimodelcounts)	2013-07-30 16:46:23 +02:00
Rico Sennrich	7b6239b663	multimodelcounts: use Word objects instead of strings in map (avoid costly conversion and string comparison)	2013-07-30 15:03:25 +02:00
Hieu Hoang	03f767ba84	Add debug out to support regression test on Ken's incremental search algorithm. Ken has his own hypothesis class...	2013-07-30 13:05:13 +01:00
Rico Sennrich	ccdcecc86f	multimodel and mosesserver: instead of optimizing first model, select model by name.	2013-07-30 13:54:50 +02:00
Hieu Hoang	b05a443f36	correct arguments to substitute-filtered-tables-and-weights.perl	2013-07-30 11:14:17 +01:00
Ulrich Germann	cb1c06d502	Merge branch 'master' of github.com:moses-smt/mosesdecoder. Conflicts: moses/Jamfile	2013-07-28 16:51:13 +01:00
Ulrich Germann	56bb485dd5	Fixed missing #include.	2013-07-28 16:39:13 +01:00
Ulrich Germann	b3ed0d56d7	Fixed missing #include.	2013-07-28 16:38:33 +01:00
Ulrich Germann	a47b6cfafa	Added call to tp->Evaluate(src) before adding a phrase table entry to the TargetPhraseCollection during lookup.	2013-07-28 16:37:20 +01:00
Ulrich Germann	1b1771dcc9	Items under 'generic' now included in libmoses	2013-07-28 16:30:41 +01:00
Ulrich Germann	a0c13837e0	Fixed computation of lexical scores.	2013-07-28 16:28:41 +01:00
Hieu Hoang	abe90b5af7	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2013-07-27 04:19:16 +01:00
Hieu Hoang	9dab7950fa	move closing of filtered file before binarizing. Otherwise file not flushed, causes error in binarizing	2013-07-27 04:18:50 +01:00
Hieu Hoang	38e312f44c	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2013-07-25 15:55:16 +01:00
Barry Haddow	29aa9ea153	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2013-07-25 15:56:44 +01:00
Barry Haddow	c127c58e9b	fix to single thread build	2013-07-25 15:56:20 +01:00
Hieu Hoang	a3e3289b08	In corpus mode, replace number with number symbol	2013-07-25 15:54:47 +01:00
Barry Haddow	7081f06413	Fixes to the shared build	2013-07-25 15:24:34 +01:00
Hieu Hoang	76a9730ca8	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2013-07-25 15:23:12 +01:00
Hieu Hoang	e2c2bc59f1	beautify	2013-07-25 15:23:05 +01:00
Hieu Hoang	78381d0213	@NUM@ --> @num@. In case using recaser	2013-07-25 15:16:15 +01:00
Phil Williams	f0b603e6b5	extract-ghkm: write glue grammars for all sentence offsets. extract-parallel now merges separate glue grammars, so remove previous workaround.	2013-07-25 13:53:32 +01:00
Hieu Hoang	d0172ed5cd	create script to convert phrase-table with alignment in Moses' dead-end format to standard format	2013-07-25 12:56:20 +01:00
Hieu Hoang	018998247a	create script to convert phrase-table with alignment in Moses' dead-end format to standard format	2013-07-25 12:52:05 +01:00
Hieu Hoang	c0aba71c79	bug processing unknown word with digits	2013-07-25 08:41:59 +01:00
Barry Haddow	f79746b3c2	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2013-07-24 20:49:59 +01:00
Hieu Hoang	6fc21a32fc	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2013-07-24 19:01:57 +01:00
Hieu Hoang	c104dee3b2	merge glue grammars, rather than writing them all to the same file. Required by Phil Williams & others when doing syntax extraction	2013-07-24 19:01:46 +01:00
Achim Ruopp	1813f9784b	Additional factoring to allow more NE recognizers; bug fixes	2013-07-24 12:44:53 -04:00
Barry Haddow	46ee1ca42d	More lattice fixes squashed by merge	2013-07-24 16:09:32 +01:00
Barry Haddow	0ce50a4c70	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2013-07-24 15:58:08 +01:00
Phil Williams	1238041f98	Add option to do Penn Treebank style tokenization. tokenizer.perl and detokenizer.perl now have an option called -penn which does Penn Treebank-like tokenization (English only). This is useful if your pipeline involves processing the corpus with tools trained on PTB-tokenized text. Unlike PTB, the tokenizer splits on slashes (e.g. "Monday/Tuesday" becomes "Monday", "@/@", "Tuesday"). If using parse-de-berkeley.perl, the option -split-slash re-joins tokens that are separated by slashes for parsing, then splits them afterwards.	2013-07-24 13:41:21 +01:00
Kenneth Heafield	71ae8c9d19	LM/Factory.cpp -> FF/Factory.cpp oops	2013-07-24 12:13:11 +01:00
Ian Johnson	68779c66b9	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2013-07-24 11:52:21 +01:00
Ian Johnson	08f64dea28	Arrow pipeline submodules now use https protocol.	2013-07-24 11:52:14 +01:00
Barry Haddow	d5e40a5b08	Merge branch 'master' of github.com:moses-smt/mosesdecoder	2013-07-24 11:38:23 +01:00
Phil Williams	b5584fdecf	extract-ghkm: workaround for extract-parallel issue. Don't write glue grammar or unknown word label files unless the sentence offset is 0. This prevents multiple instances of extract-ghkm writing to the same two files when extract-parallel is used. TODO: Better solutions might be: (1) modify extract-parallel so that it only configures one instance of extract-ghkm to write the glue / unknown-lhs files (like the current workaround, this assumes file chunks are representative of the whole); (2) add multithreading support directly to extract-ghkm; (3) write distinct output files for each extract-ghkm instance and combine them on completion.	2013-07-23 14:55:16 +01:00


10591 Commits