Commit Graph

349 Commits

Author SHA1 Message Date
Joerg Tiedemann
c94abcbb3f finetuned packages 2020-03-21 21:36:29 +02:00
Joerg Tiedemann
87551ac387 target for extracting text from all wikis 2020-03-20 15:32:29 +02:00
Joerg Tiedemann
fd6db4e93a new marian and fixed path to mono lang check in backtranslation 2020-03-19 20:42:27 +02:00
Joerg Tiedemann
c573551713 new models 2020-03-02 16:59:31 +02:00
Joerg Tiedemann
233083a8b8 simplification evaluation with BLEU 2020-03-01 00:25:25 +02:00
Joerg Tiedemann
d13a9461f0 simplification evaluation with BLEU 2020-03-01 00:25:05 +02:00
Joerg Tiedemann
3f57e4f873 lang specific cleanup scripts are now possible 2020-02-29 18:23:08 +02:00
Joerg Tiedemann
f5111a27a7 lang specific cleanup scripts are now possible 2020-02-29 18:19:57 +02:00
Joerg Tiedemann
0ff0e625d5 train text simplification model 2020-02-29 17:59:27 +02:00
Joerg Tiedemann
2805bf49e7 latest models 2020-02-26 21:20:17 +02:00
Joerg Tiedemann
44182291dc skip word alignment if not necessary 2020-02-25 09:00:24 +02:00
Joerg Tiedemann
d08fdd4040 finetune README improved 2020-02-16 00:07:53 +02:00
Joerg Tiedemann
1a7cbbb13e finetune branch downloads models from object storage 2020-02-15 23:40:55 +02:00
Joerg Tiedemann
325b4c1903 latest models 2020-02-14 15:38:12 +02:00
Joerg Tiedemann
0e893a06e0 finetuning for fi-en 2020-02-14 00:12:55 +02:00
Joerg Tiedemann
870804f4ee finetuning and backtranslations 2020-02-11 23:20:11 +02:00
Joerg Tiedemann
4b7ae1a39b added function to convert TMX file for fine-tuning (requires OpusTools-perl) 2020-02-10 21:49:44 +02:00
Joerg Tiedemann
8ff98705b7 vocabs from monolingual and character coverage in sentence piece models depending on size of alphabet 2020-02-08 18:10:38 +02:00
Joerg Tiedemann
811815064b new mode: SentencePieceModels trained on monolingual data 2020-02-08 15:21:37 +02:00
Joerg Tiedemann
ee8c27e3db removed punctuation normalisation and added language filter 2020-02-08 00:19:21 +02:00
Joerg Tiedemann
91576aa3e9 Merge branch 'master' of github.com:Helsinki-NLP/OPUS-MT-train 2020-02-05 13:21:35 +02:00
Joerg Tiedemann
b52bbb676a makefile update 2020-02-05 13:20:33 +02:00
tiedemann
d57dc52e7f Merge pull request #2 from veer66/docker: Add a Dockerfile for training on CPU 2020-02-04 10:26:21 +02:00
Vee Satayamas
07dfd31259 Add more aligners 2020-02-03 16:03:26 +07:00
Vee Satayamas
a759e50d2f Merge branch 'docker' of github.com:veer66/OPUS-MT-train into docker 2020-02-03 15:42:18 +07:00
Vee Satayamas
683d9a5a66 Add Dockerfile for GPU 2020-02-03 15:41:34 +07:00
Vee Satayamas
fdf6f21b19 Add a Dockerfile for training on CPU 2020-01-30 09:02:29 +00:00
Joerg Tiedemann
106b06aa4c avoid uploading linked dist files 2020-01-29 21:46:18 +02:00
Joerg Tiedemann
bb5532ab71 new models 2020-01-24 13:39:21 +02:00
Joerg Tiedemann
bb98f03df5 backtranslate bugfix 2020-01-22 13:33:28 +02:00
Joerg Tiedemann
a0d8140cf2 latest models added 2020-01-21 11:21:05 +02:00
Joerg Tiedemann
f32ddd06ce allwikis 2020-01-20 23:37:40 +02:00
Joerg Tiedemann
f97bc1895c fixed model names 2020-01-20 00:37:24 +02:00
Joerg Tiedemann
2887762198 bugfixing and optimising makefiles 2020-01-19 19:00:13 +02:00
Joerg Tiedemann
596cae8922 train with backtranslations 2020-01-18 20:37:01 +02:00
Joerg Tiedemann
0185534823 pre-processing scripts fixed 2020-01-17 13:42:18 +02:00
Joerg Tiedemann
831acb1ae7 Tatoeba test sets added 2020-01-17 12:43:53 +02:00
Joerg Tiedemann
b0a586cbd0 new models added 2020-01-16 16:37:46 +02:00
Joerg Tiedemann
58690950b0 all models = opus 2020-01-15 23:18:07 +02:00
Joerg Tiedemann
f749bd7a87 goethe test setting 2020-01-12 22:08:50 +02:00
Joerg Tiedemann
9b43d4cf81 evaluate arbitrary models 2020-01-12 18:31:40 +02:00
Joerg Tiedemann
62af78cbd8 evaluate arbitrary models 2020-01-12 18:04:36 +02:00
Joerg Tiedemann
5fd415e69b default name always opus 2020-01-12 01:25:14 +02:00
Joerg Tiedemann
e2ed3d85d1 finetuning and backtranslation 2020-01-12 01:10:53 +02:00
Joerg Tiedemann
fe16a0c4dd backtranslation scripts 2020-01-11 00:29:06 +02:00
Joerg Tiedemann
1178dadf8d fix some spm models 2020-01-10 21:58:42 +02:00
Joerg Tiedemann
93ab075d3f consistent BPW/SPM models 2020-01-10 18:17:12 +02:00
Joerg Tiedemann
84ac4e5a8b fixed license 2020-01-10 17:04:04 +02:00
Joerg Tiedemann
b36d9a3e22 initial import 2020-01-10 16:45:42 +02:00