Commit Graph

  • 3c6793045b moved results table generation for tatoeba models Joerg Tiedemann 2021-02-15 20:35:29 +0200
  • c46eb49c26 memad models with tatoeba data and some cleanup in tatoeba langgroup language expansion Joerg Tiedemann 2021-02-03 20:49:14 +0200
  • 4305ad01b9 tatoeba model result lists added Joerg Tiedemann 2021-01-18 14:53:54 +0200
  • b5fbc4a52a Merge branch 'master' of github.com:Helsinki-NLP/OPUS-MT-train Joerg Tiedemann 2021-01-14 23:12:46 +0200
  • 385e8298b2 more fixes with evaluation recipes of multilingual tatoeba models Joerg Tiedemann 2021-01-14 23:07:12 +0200
  • f74531468e Update README.md tiedemann 2021-01-14 22:31:09 +0200
  • 81ce0bf8c4 fixed a problem with fine-tuning Tatoeba multilingual models for specific language pairs Joerg Tiedemann 2021-01-09 00:29:04 +0200
  • 71d49406eb tutorial links added Joerg Tiedemann 2021-01-07 23:50:33 +0200
  • 527ab54caa subset result tables for Tatoeba now also with reverse translation direction Joerg Tiedemann 2021-01-07 23:19:24 +0200
  • c3be953980 recipe for finetuning multilingual models for specific language pairs (example Tatoeba models) Joerg Tiedemann 2021-01-07 21:45:59 +0200
  • 359657a523 make it possible to update release list Joerg Tiedemann 2021-01-05 00:49:43 +0200
  • 3413c8afe0 added file for released Tatoeba results Joerg Tiedemann 2021-01-05 00:44:52 +0200
  • f574427461 add recipe to release all unfinished Tatoeba models Joerg Tiedemann 2021-01-03 00:55:03 +0200
  • f196818110 renamed some recipes for tatoeba to be more flexible Joerg Tiedemann 2020-12-29 12:42:37 +0200
  • 721527e8ef Merge branch 'master' of github.com:Helsinki-NLP/OPUS-MT-train Joerg Tiedemann 2020-11-26 12:59:21 +0200
  • 7bd502edcc updated model list Joerg Tiedemann 2020-11-26 12:57:46 +0200
  • 474891563a Update README.md tiedemann 2020-11-19 12:02:06 +0200
  • 3a66dc6fd4 Merge branch 'master' of github.com:Helsinki-NLP/OPUS-MT-train Joerg Tiedemann 2020-10-27 23:49:39 +0200
  • 1186d9afd5 tico19 benchmark added Joerg Tiedemann 2020-10-27 23:48:09 +0200
  • 40a6b5ab6b fixed bug in release target Joerg Tiedemann 2020-10-04 00:10:11 +0300
  • 79a49e1b9e Merge pull request #31 from Helsinki-NLP/sam-suppress-missing-perl-module-warnings-when-installing-fix tiedemann 2020-10-01 23:26:45 +0300
  • 4cc192da15 avoid error messages in data creation when no data files exist for some language pairs Joerg Tiedemann 2020-09-25 10:05:18 +0300
  • c6356d3a8a back to yml vocab files as default Joerg Tiedemann 2020-09-25 09:58:25 +0300
  • 9ee38fe355 Suppress warnings when testing for missing Perl modules sam-suppress-missing-perl-module-warnings-when-installing-fix Traubert 2020-09-24 13:25:14 +0300
  • ec69daa989 Some puhti modules needed for installing sam-puhti-environment-build-fix Traubert 2020-09-24 13:18:18 +0300
  • 21433867ab If we include lib/config.mk before prerequisites are made, make fails sam-fixes Traubert 2020-09-24 12:25:24 +0300
  • 1e349f269c Fix type Traubert 2020-09-24 12:15:08 +0300
  • f9a44bdb99 merged Joerg Tiedemann 2020-09-18 23:08:56 +0300
  • 87a5354de5 changes to tatoeba recipes Joerg Tiedemann 2020-09-18 23:05:46 +0300
  • a61cf48443 add option to skip sentence piecce vocabs but use marian_vocab instead Jörg Tiedemann 2020-09-16 19:33:19 +0300
  • 913d31472e Merge pull request #25 from Helsinki-NLP/sam-fixes tiedemann 2020-09-16 09:28:46 +0300
  • c564bd1f56 fix in fetching data for Sami languages Joerg Tiedemann 2020-09-16 09:25:36 +0300
  • 1da54c4155 Fix typo Traubert 2020-09-15 15:02:20 +0300
  • 58fbf0bdd8 back to old subword model names Jörg Tiedemann 2020-09-14 08:53:57 +0300
  • c2798e9758 plain text vocab files from spm models Jörg Tiedemann 2020-09-13 22:17:21 +0300
  • 24e92de56a proper release packages for models with internal sentence piece vocabs Jörg Tiedemann 2020-09-13 00:00:15 +0300
  • 666b2b8462 internal sentence piece models in transformers Jörg Tiedemann 2020-09-12 16:16:01 +0300
  • ddafb43d66 removed dependence on moses tools in preprocessing script for released spm packages Jörg Tiedemann 2020-09-12 14:42:10 +0300
  • c0cb356417 added acknowledgements Jörg Tiedemann 2020-09-12 12:01:02 +0300
  • 16eef8e45d moved project makefiles to lib/projects Jörg Tiedemann 2020-09-10 12:12:44 +0300
  • 1a6e29275d dev data is now uniq to avoid overlaps with test data Jörg Tiedemann 2020-09-09 23:21:07 +0300
  • 3735af4ec1 more documentation Jörg Tiedemann 2020-09-07 23:00:01 +0300
  • 3367ad2e34 documentation of low-resource languages Jörg Tiedemann 2020-09-06 23:56:16 +0300
  • 909e525a2d keep translations even if uncomplete in pivoting Jörg Tiedemann 2020-09-06 00:22:48 +0300
  • a47c292152 pivoting and documentation Jörg Tiedemann 2020-09-05 22:19:00 +0300
  • ad828c3124 started tutorial and fixes to backtranslate makefile Jörg Tiedemann 2020-09-05 00:16:22 +0300
  • d11f74ce41 added bpe submodule Tiedemann Jörg 2020-09-04 15:34:20 +0300
  • 96eaad2d05 added possibility to fetch moses file from ObjectStore (instead of reading with opus_read) Tiedemann 2020-09-03 22:04:44 +0300
  • 971ece9606 fix tatoeba data labels Joerg Tiedemann 2020-09-03 07:55:44 +0300
  • 7e97f4bc19 setup and installation information added Tiedemann 2020-09-02 16:49:22 +0300
  • 1435b7849a moved allas recipes to a different makefile Tiedemann 2020-09-02 16:35:35 +0300
  • 2332732577 make compatible with mac osx and include submodules for required tools Tiedemann 2020-09-02 15:52:34 +0300
  • cde8e65a5b new buckets for fetch and store, uncompressed now and follow-links Joerg Tiedemann 2020-09-01 16:15:01 +0300
  • 1a279ce6f1 started documentation of project specific models Joerg Tiedemann 2020-08-28 15:53:23 +0300
  • 639bd2adda started documentation of project specific models Joerg Tiedemann 2020-08-28 15:51:37 +0300
  • 2c04e48dbe fixed an important bug in data merging Joerg Tiedemann 2020-08-28 11:52:46 +0300
  • e31550a3ad enabled fetching OPUS data instead of reading local files if necessary Joerg Tiedemann 2020-08-28 10:53:11 +0300
  • 94eeec13eb take away dependence on local OPUS files for finding data Joerg Tiedemann 2020-08-27 22:36:50 +0300
  • 831ee89f76 fixed bug in env.mk Joerg Tiedemann 2020-08-26 22:18:12 +0300
  • 596dd993a5 more documentation Joerg Tiedemann 2020-08-26 21:45:03 +0300
  • a8b54f5311 some info about training added Joerg Tiedemann 2020-08-26 15:12:38 +0300
  • 2f8a37cc92 more details about data compilation added Joerg Tiedemann 2020-08-26 14:31:50 +0300
  • dac6070069 started some more documentation Joerg Tiedemann 2020-08-26 09:59:24 +0300
  • f2a413b740 minor cleanup in env Joerg Tiedemann 2020-08-26 01:01:44 +0300
  • 4c35456038 cleanup in data makefile Joerg Tiedemann 2020-08-26 00:44:02 +0300
  • 9375f37886 missing makefile added Joerg Tiedemann 2020-08-25 22:42:33 +0300
  • 308bf647f0 fetchdata src and dest dir Joerg Tiedemann 2020-08-25 22:07:32 +0300
  • 1e23566f30 fixed fetch and store from and to allas Joerg Tiedemann 2020-08-23 10:08:06 +0300
  • 0e27198048 store and fetch work data Joerg Tiedemann 2020-08-22 23:51:37 +0300
  • d7252e32b7 tatoeba monolingual data Joerg Tiedemann 2020-08-05 00:00:24 +0300
  • 6bf0207cc6 list of models added Joerg Tiedemann 2020-08-03 11:58:51 +0300
  • c9fcb7f35d tatoeba langgroup models Joerg Tiedemann 2020-08-02 11:38:42 +0300
  • 1b913277b3 tatoeba language group models with various sample sizews Joerg Tiedemann 2020-07-25 22:52:33 +0300
  • 5493aeddb4 fixed a problem with lang group targets Joerg Tiedemann 2020-07-14 21:40:49 +0300
  • 068b82cc1d fixed a bug in eval/dist groups Joerg Tiedemann 2020-07-14 13:29:06 +0300
  • e2edc4195a result tables for language groups and minor fixes for start scripts in Tatoeba challenge Joerg Tiedemann 2020-07-10 11:59:37 +0300
  • ec6d7c7142 tatoeba langgroups Joerg Tiedemann 2020-07-04 23:37:39 +0300
  • 7df91a9eaa language group jobs with some more documentation Joerg Tiedemann 2020-06-29 12:26:45 +0300
  • 62c9414122 lang groups Joerg Tiedemann 2020-06-29 00:15:35 +0300
  • 46a0b2b15a fixed dist-packaging Joerg Tiedemann 2020-06-27 13:56:51 +0300
  • e2bc2acb3b re-organised targets for multilingual models of language groups Joerg Tiedemann 2020-06-27 12:29:50 +0300
  • 9e186d82d6 bugfix in tatoeba data extraction for multilingual data files (language code clash) Joerg Tiedemann 2020-06-25 00:45:25 +0300
  • 844f8bf72a removed unnecessary pre-processing for chinese Joerg Tiedemann 2020-06-19 16:12:06 +0300
  • b7f45e2a74 more details in model config Joerg Tiedemann 2020-06-18 20:50:22 +0300
  • 4e18da6e4c fix chinese/korean/japanese language codes Joerg Tiedemann 2020-06-17 22:02:39 +0300
  • e141772b34 fixed multilingual tatoeba evaluation Joerg Tiedemann 2020-06-11 00:54:40 +0300
  • cc16be10d4 final fixes to multilingual tatoeba model scripts Joerg Tiedemann 2020-06-09 11:19:58 +0300
  • b7691875c2 tatoeba models now operational Joerg Tiedemann 2020-06-09 00:12:16 +0300
  • 035cca7c1a fixed tatoeba model scripts Joerg Tiedemann 2020-06-08 17:24:39 +0300
  • e07eb14984 fit-data-size fixed Joerg Tiedemann 2020-06-08 14:14:55 +0300
  • 6cb9959e82 tatoeba challenge model scripts updated Joerg Tiedemann 2020-06-06 20:49:54 +0300
  • edaf361803 multilingual tatoeba models and some documentation added Joerg Tiedemann 2020-06-03 15:39:18 +0300
  • c44e92d52a fixed bug in tatoeba model call Joerg Tiedemann 2020-06-03 01:09:28 +0300
  • eeaef7768c tatoeba models added Joerg Tiedemann 2020-06-03 00:16:21 +0300
  • ec43fcd30a fixed a bug in eval-testsets Joerg Tiedemann 2020-05-29 14:43:36 +0300
  • d0a217cf40 wikimatrix models added Joerg Tiedemann 2020-05-21 20:51:38 +0300
  • 716d7b52c1 fixed testset names and backtranslation sentence splitting Joerg Tiedemann 2020-05-20 23:19:48 +0300
  • 04d72ff8ed fixes with pivoting Joerg Tiedemann 2020-05-18 21:36:53 +0300
  • b01b4f22c3 pivot-based translations added Joerg Tiedemann 2020-05-17 22:43:05 +0300
  • 1246bcd271 added some size info to train data README Joerg Tiedemann 2020-05-17 01:21:57 +0300