opus+nt+bt-2021-03-30.zip
- dataset: opus+nt+bt
- model: transformer-align
- source language(s): en
- target language(s): bcl
- pre-processing: normalization + SentencePiece (spm12k,spm32k)
- download: opus+nt+bt-2021-03-30.zip
Training data: opus+nt+bt
- en-bcl: JW300 (470468) new-testament (11623) wiki.aa (43432)
- en-bcl: total size = 525523
- total size (opus+nt+bt): 525475
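
The per-corpus counts listed above can be checked against the reported "total size" with a little arithmetic. The following is a minimal sketch, not part of the release tooling; the helper name `corpus_total` is hypothetical. The README does not explain why the final dataset size (525475) is slightly lower than the summed total (525523), so the sketch only reports the difference.

```python
import re

def corpus_total(line: str) -> int:
    """Sum the parenthesised counts in a training-data line (hypothetical helper)."""
    return sum(int(n) for n in re.findall(r"\((\d+)\)", line))

line = "en-bcl: JW300 (470468) new-testament (11623) wiki.aa (43432)"
total = corpus_total(line)

print(total)           # summed corpus sizes
print(total - 525475)  # difference to the reported final dataset size
```

The same check applies to the later releases, where each additional `wiki.aa_*` back-translation set contributes its own count to the sum.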
Validation data
- bcl-en: wikimedia, 1153
- total-size-shuffled: 775
- devset-selected: top 250 lines of wikimedia.src.shuffled!
- testset-selected: next 525 lines of wikimedia.src.shuffled!
- devset-unused: added to traindata
- test set translations: opus+nt+bt-2021-03-30.test.txt
- test set scores: opus+nt+bt-2021-03-30.eval.txt
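
The dev/test selection described above (the top lines of the shuffled file become the dev set, the next lines become the test set, and anything left over goes back into the training data) can be sketched as follows. This is an illustration under stated assumptions, not the actual OPUS-MT build script: the function name `split_dev_test` is hypothetical, and the placeholder sentences stand in for the contents of `wikimedia.src.shuffled`.

```python
from typing import List, Tuple

def split_dev_test(
    lines: List[str], n_dev: int, n_test: int
) -> Tuple[List[str], List[str], List[str]]:
    """Take the first n_dev lines as dev, the next n_test as test, the rest as unused."""
    dev = lines[:n_dev]
    test = lines[n_dev:n_dev + n_test]
    unused = lines[n_dev + n_test:]  # would be added to the training data
    return dev, test, unused

# Placeholder data with the shuffled size reported for this release (775 lines).
shuffled = [f"sentence {i}" for i in range(775)]
dev, test, unused = split_dev_test(shuffled, n_dev=250, n_test=525)

print(len(dev), len(test), len(unused))
```

With 775 shuffled lines and a 250/525 split, no lines remain unused; the later releases draw from a larger shuffled pool, so the remainder there is non-empty.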
Benchmarks
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
wikimedia.en-bcl | 17.3 | 0.426 | 525 | 28399 | 0.840 |
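
Assuming the BP column is the standard BLEU brevity penalty, BP = exp(1 - r/c) when the system output length c is shorter than the reference length r (and 1 otherwise), the reported value can be inverted to estimate how much shorter the translations are than the references. The sketch below is illustrative; `length_ratio_from_bp` is a hypothetical helper name.

```python
import math
from typing import Optional

def length_ratio_from_bp(bp: float) -> Optional[float]:
    """Recover c/r (output length over reference length) from a brevity penalty.

    Returns None when bp >= 1, because the penalty is capped at 1 and the
    ratio can no longer be recovered from it.
    """
    if bp >= 1.0:
        return None
    # bp = exp(1 - r/c)  =>  r/c = 1 - ln(bp)  =>  c/r = 1 / (1 - ln(bp))
    return 1.0 / (1.0 - math.log(bp))

ratio = length_ratio_from_bp(0.840)
print(f"output/reference length ratio: {ratio:.3f}")
```

For BP = 0.840 this gives a ratio of roughly 0.85, i.e. the system output is about 15% shorter than the reference.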
opus+nt+bt+bt-2021-04-01.zip
- dataset: opus+nt+bt+bt
- model: transformer-align
- source language(s): en
- target language(s): bcl
- pre-processing: normalization + SentencePiece (spm12k,spm32k)
- download: opus+nt+bt+bt-2021-04-01.zip
Training data: opus+nt+bt+bt
- en-bcl: JW300 (470468) new-testament (11623) wiki.aa (45474)
- en-bcl: total size = 527565
- total size (opus+nt+bt+bt): 527524
Validation data
- bcl-en: wikimedia, 1153
- total-size-shuffled: 775
- devset-selected: top 250 lines of wikimedia.src.shuffled!
- testset-selected: next 525 lines of wikimedia.src.shuffled!
- devset-unused: added to traindata
- test set translations: opus+nt+bt+bt-2021-04-01.test.txt
- test set scores: opus+nt+bt+bt-2021-04-01.eval.txt
Benchmarks
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
wikimedia.en-bcl | 21.6 | 0.476 | 525 | 28399 | 0.789 |
opus+nt+bt+bt+bt-2021-04-03.zip
- dataset: opus+nt+bt+bt+bt
- model: transformer-align
- source language(s): en
- target language(s): bcl
- pre-processing: normalization + SentencePiece (spm12k,spm32k)
- download: opus+nt+bt+bt+bt-2021-04-03.zip
Training data: opus+nt+bt+bt+bt
- en-bcl: JW300 (470468) new-testament (11623) wiki.aa (45474)
- en-bcl: total size = 527565
- total size (opus+nt+bt+bt+bt): 527496
Validation data
- bcl-en: wikimedia, 1153
- total-size-shuffled: 775
- devset-selected: top 250 lines of wikimedia.src.shuffled!
- testset-selected: next 525 lines of wikimedia.src.shuffled!
- devset-unused: added to traindata
- test set translations: opus+nt+bt+bt+bt-2021-04-03.test.txt
- test set scores: opus+nt+bt+bt+bt-2021-04-03.eval.txt
Benchmarks
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
wikimedia.en-bcl | 22.7 | 0.482 | 525 | 28399 | 0.895 |
opus2+nt+bt+bt+bt-2021-04-03.zip
- dataset: opus2+nt+bt+bt+bt
- model: transformer-align
- source language(s): en
- target language(s): bcl
- pre-processing: normalization + SentencePiece (spm12k,spm32k)
- download: opus2+nt+bt+bt+bt-2021-04-03.zip
Training data: opus2+nt+bt+bt+bt
- en-bcl: JW300 (470468) new-testament (11623) wiki.aa (45474) wiki.aa_opus+nt+bt-2021-04-01 (45474)
- en-bcl: total size = 573039
- total size (opus2+nt+bt+bt+bt): 572969
Validation data
- bcl-en: wikimedia, 1153
- total-size-shuffled: 775
- devset-selected: top 250 lines of wikimedia.src.shuffled!
- testset-selected: next 525 lines of wikimedia.src.shuffled!
- devset-unused: added to traindata
- test set translations: opus2+nt+bt+bt+bt-2021-04-03.test.txt
- test set scores: opus2+nt+bt+bt+bt-2021-04-03.eval.txt
Benchmarks
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
wikimedia.en-bcl | 23.9 | 0.497 | 525 | 28399 | 0.820 |
opus+nt+bt+bt+bt+bt-2021-04-06.zip
- dataset: opus+nt+bt+bt+bt+bt
- model: transformer-align
- source language(s): en
- target language(s): bcl
- pre-processing: normalization + SentencePiece (spm12k,spm32k)
- download: opus+nt+bt+bt+bt+bt-2021-04-06.zip
Training data: opus+nt+bt+bt+bt+bt
- en-bcl: JW300 (470468) new-testament (11623) wiki.aa (45474) wiki.aa_opus+nt+bt+bt-2021-04-03 (45474) wiki.aa_opus+nt+bt-2021-04-01 (45474)
- en-bcl: total size = 618513
- total size (opus+nt+bt+bt+bt+bt): 618427
Validation data
- bcl-en: wikimedia, 1153
- total-size-shuffled: 775
- devset-selected: top 250 lines of wikimedia.src.shuffled!
- testset-selected: next 525 lines of wikimedia.src.shuffled!
- devset-unused: added to traindata
- test set translations: opus+nt+bt+bt+bt+bt-2021-04-06.test.txt
- test set scores: opus+nt+bt+bt+bt+bt-2021-04-06.eval.txt
Benchmarks
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
wikimedia.en-bcl | 24.4 | 0.498 | 525 | 28399 | 0.805 |
opus+nt+bt+bt-2021-04-10.zip
- dataset: opus+nt+bt+bt
- model: transformer-align
- source language(s): en
- target language(s): bcl
- pre-processing: normalization + SentencePiece (spm12k,spm32k)
- download: opus+nt+bt+bt-2021-04-10.zip
Training data: opus+nt+bt+bt
- en-bcl: JW300 (470468) new-testament (11623) wiki.aa (45494) wiki.aa_opus+nt+bt+bt+bt-2021-04-05 (45474) wiki.aa_opus+nt+bt+bt-2021-04-03 (45474) wiki.aa_opus+nt+bt-2021-04-01 (45474)
- en-bcl: total size = 664007
- unused dev/test data is added to training data
- total size (opus+nt+bt+bt): 665111
Validation data
- bcl-en: wikimedia, 2767
- total-size-shuffled: 1966
- devset-selected: top 250 lines of wikimedia.src.shuffled!
- testset-selected: next 500 lines of wikimedia.src.shuffled!
- devset-unused: added to traindata
- test set translations: opus+nt+bt+bt-2021-04-10.test.txt
- test set scores: opus+nt+bt+bt-2021-04-10.eval.txt
Benchmarks
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
wikimedia.en-bcl | 30.7 | 0.572 | 500 | 29131 | 0.921 |
opus+nt+bt-2021-04-11.zip
- dataset: opus+nt+bt
- model: transformer-align
- source language(s): en
- target language(s): bcl
- pre-processing: normalization + SentencePiece (spm12k,spm32k)
- download: opus+nt+bt-2021-04-11.zip
Training data: opus+nt+bt
- en-bcl: JW300 (470468) new-testament (11623) wiki.aa (45494) wiki.aa_opus+nt+bt+bt+bt-2021-04-05 (45474) wiki.aa_opus+nt+bt+bt-2021-04-03 (45474) wiki.aa_opus+nt+bt-2021-04-01 (45474)
- en-bcl: total size = 664007
- unused dev/test data is added to training data
- total size (opus+nt+bt): 666118
Validation data
- bcl-en: wikimedia, 5033
- total-size-shuffled: 4207
- devset-selected: top 1000 lines of wikimedia.src.shuffled!
- testset-selected: next 1000 lines of wikimedia.src.shuffled!
- devset-unused: added to traindata
- test set translations: opus+nt+bt-2021-04-11.test.txt
- test set scores: opus+nt+bt-2021-04-11.eval.txt
Benchmarks
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
wikimedia.en-bcl | 31.9 | 0.585 | 1000 | 27681 | 1.000 |
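
To compare benchmark rows across releases programmatically, each pipe-separated row can be turned into a typed record. This is a minimal sketch that assumes the column layout shown in the tables above; the names `split_cells` and `parse_row` are hypothetical. The test sets differ in size between releases (525, 500 and 1000 sentences), so any comparison of the parsed scores should keep the `#sent` field in view.

```python
from typing import Dict, List, Union

HEADER = "testset | BLEU | chr-F | #sent | #words | BP |"
ROW = "wikimedia.en-bcl | 31.9 | 0.585 | 1000 | 27681 | 1.000 |"

def split_cells(line: str) -> List[str]:
    """Split a pipe-table line into trimmed cells, ignoring outer pipes."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]

def parse_row(header: str, row: str) -> Dict[str, Union[str, int, float]]:
    """Map header names to typed values: counts as int, scores as float."""
    record: Dict[str, Union[str, int, float]] = {}
    for key, cell in zip(split_cells(header), split_cells(row)):
        if key == "testset":
            record[key] = cell
        elif key in ("#sent", "#words"):
            record[key] = int(cell)
        else:
            record[key] = float(cell)
    return record

record = parse_row(HEADER, ROW)
print(record)
```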