Commit Graph

37 Commits

Author SHA1 Message Date
Joerg Tiedemann
d62f74dcc3 changes on puhti 2022-04-27 10:53:43 +03:00
Joerg Tiedemann
98e9059b3b merged 2022-03-17 22:34:52 +02:00
Joerg Tiedemann
f1a8153431 elg changes on puhti 2022-03-17 22:03:04 +02:00
Joerg Tiedemann
dee0f6b951 elg project stuff and changes done on mahti 2022-03-17 21:02:11 +02:00
Joerg Tiedemann
827af80bf6 new scores 2022-03-14 08:25:34 +02:00
Joerg Tiedemann
c421fbdb15 fixes with names 2022-02-26 17:57:49 +02:00
Joerg Tiedemann
e94c43062a linking latest release is now correct 2022-02-26 00:20:31 +02:00
Joerg Tiedemann
8cd6a7b246 student models for tatoeba 2022-01-25 22:43:48 +02:00
Joerg Tiedemann
48be7e8c66 quantization recipes added 2022-01-21 14:44:47 +02:00
Joerg Tiedemann
b4741351d5 tatoeba models 2021-12-22 17:31:22 +02:00
Joerg Tiedemann
d617a63c76 cleanup in tatoeba data recipes 2021-12-18 00:27:04 +02:00
Joerg Tiedemann
ae9637b09c avoid setting TATOEBA_DATASET recursively 2021-10-10 20:20:31 +03:00
Joerg Tiedemann
07cae1b0a5 pivotlang option added to tatoeba langgroup models and removed raw-langcodes in dist packages 2021-10-10 00:57:03 +03:00
Joerg Tiedemann
200662863e added recipes for tatoeba models other than English 2021-05-04 08:49:16 +03:00
Joerg Tiedemann
cde8f0d0af balance dev data in multilingual models and a bug fixed in preprocess script 2021-03-30 00:00:28 +03:00
Joerg Tiedemann
3cd0bd3f75 create vocabulary files from spm models 2021-03-14 22:05:21 +02:00
Joerg Tiedemann
bb39c060c0 added recipe for refreshing release info 2021-03-13 00:29:23 +02:00
Joerg Tiedemann
098509d257 fixed language label problem in tatoeba model training recipes 2021-02-19 23:56:21 +02:00
Joerg Tiedemann
ca2a249845 bugfixing in tatoeba MT model recipes 2021-02-18 13:49:16 +02:00
Joerg Tiedemann
53c5680268 fixed tatoeba group recipes 2021-02-16 20:36:00 +02:00
Joerg Tiedemann
3c6793045b moved results table generation for tatoeba models 2021-02-15 20:35:29 +02:00
Joerg Tiedemann
1186d9afd5 tico19 benchmark added 2020-10-27 23:48:09 +02:00
Joerg Tiedemann
40a6b5ab6b fixed bug in release target 2020-10-04 00:10:11 +03:00
Jörg Tiedemann
58fbf0bdd8 back to old subword model names 2020-09-14 08:53:57 +03:00
Jörg Tiedemann
c2798e9758 plain text vocab files from spm models 2020-09-13 22:17:21 +03:00
Jörg Tiedemann
24e92de56a proper release packages for models with internal sentence piece vocabs 2020-09-13 00:00:15 +03:00
Jörg Tiedemann
666b2b8462 internal sentence piece models in transformers 2020-09-12 16:16:01 +03:00
Jörg Tiedemann
ddafb43d66 removed dependence on moses tools in preprocessing script for released spm packages 2020-09-12 14:42:10 +03:00
Jörg Tiedemann
a47c292152 pivoting and documentation 2020-09-05 22:19:00 +03:00
Joerg Tiedemann
94eeec13eb take away dependence on local OPUS files for finding data 2020-08-27 22:36:50 +03:00
Joerg Tiedemann
46a0b2b15a fixed dist-packaging 2020-06-27 13:56:51 +03:00
Joerg Tiedemann
844f8bf72a removed unnecessary pre-processing for chinese 2020-06-19 16:12:06 +03:00
Joerg Tiedemann
4e18da6e4c fix chinese/korean/japanese language codes 2020-06-17 22:02:39 +03:00
Joerg Tiedemann
716d7b52c1 fixed testset names and backtranslation sentence splitting 2020-05-20 23:19:48 +03:00
Joerg Tiedemann
7ef908dcd7 translate with backtranslations 2020-05-13 00:41:07 +03:00
Joerg Tiedemann
e4455e510a a bit more info added for data sets 2020-05-09 22:33:33 +03:00
Joerg Tiedemann
6b8e69269a better division of the massive tasks makefile 2020-05-03 20:27:55 +03:00