fairseq/tests
Myle Ott 937535dba0 Allow dictionaries to overwrite entries with #fairseq:overwrite comment (#1073)
Summary:
[This commit](dd1298e15f) made it so that duplicate entries in a dictionary are ignored. Unfortunately the Camembert model depends on overwriting `<unk>`, `<s>` and `</s>`.

The proposed solution here is to allow the dictionary to have entries like:
```
<unk> 999 #fairseq:overwrite
<s> 999 #fairseq:overwrite
</s> 999 #fairseq:overwrite
, 999
▁de 999
. 999
(...)
```

Entries marked this way preserve the old overwriting behavior, so we can release a new `camembert.v0.tar.gz` with a dictionary like the one above and it will load correctly.
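
As an illustration of the format above, here is a minimal standalone sketch of a loader that honors the `#fairseq:overwrite` comment. It is not the code from this commit: the function name `parse_dictionary`, the `preloaded` parameter, and the exact overwrite semantics (a flagged duplicate takes a new slot and the symbol's index is repointed to it, while an unflagged duplicate is ignored) are assumptions made for the example.

```python
OVERWRITE_FLAG = "#fairseq:overwrite"


def parse_dictionary(lines, preloaded=()):
    """Parse '<symbol> <count> [#fairseq:overwrite]' lines (hypothetical helper).

    Returns (symbols, counts, indices), where indices maps symbol -> slot.
    """
    symbols, counts, indices = [], [], {}
    # Special symbols that exist before the file is read (assumed, count 0).
    for sym in preloaded:
        indices[sym] = len(symbols)
        symbols.append(sym)
        counts.append(0)

    for raw in lines:
        line = raw.rstrip()
        if not line:
            continue
        overwrite = line.endswith(OVERWRITE_FLAG)
        if overwrite:
            # Drop the trailing comment before splitting symbol and count.
            line = line[: -len(OVERWRITE_FLAG)].rstrip()
        symbol, count = line.rsplit(" ", 1)
        if symbol in indices and not overwrite:
            continue  # duplicate without the flag: ignored
        # New symbol, or a flagged duplicate: append a slot and (re)point to it.
        indices[symbol] = len(symbols)
        symbols.append(symbol)
        counts.append(int(count))
    return symbols, counts, indices


symbols, counts, indices = parse_dictionary(
    ["<unk> 999 #fairseq:overwrite", ", 999", ", 5", "▁de 999"],
    preloaded=["<s>", "<pad>", "</s>", "<unk>"],
)
print(indices["<unk>"], indices[","], counts[indices[","]])
```

In this sketch the flagged `<unk>` line moves `<unk>` from its preloaded slot to the position where it appears in the file, while the second `,` line is skipped because it carries no flag.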
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1073

Reviewed By: kahne

Differential Revision: D20284569

Pulled By: myleott

fbshipit-source-id: bf78fbff13c94bf8a6485cbdda62305ddc30c056
2020-03-08 06:52:00 -07:00
speech_recognition refactor namespaces in criterion interface (#1729) 2020-03-04 16:43:59 -08:00
__init__.py fairseq-py goes distributed (#106) 2018-02-27 17:09:42 -05:00
test_average_checkpoints.py Small fixes 2019-08-19 15:08:25 -07:00
test_backtranslation_dataset.py Add a diverse beam search variant to sequence_generator.py (#953) 2020-01-06 08:24:02 -08:00
test_binaries.py Move MoE files into examples (#1040) 2020-02-21 14:13:37 -08:00
test_bmuf.py add vq-wav2vec (#1029) 2020-02-29 18:25:34 -08:00
test_character_token_embedder.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_concat_dataset.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_convtbc.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_dictionary.py Allow dictionaries to overwrite entries with #fairseq:overwrite comment (#1073) 2020-03-08 06:52:00 -07:00
test_export.py Add save and load tests to fairseq export test (#1653) 2020-01-30 16:14:35 -08:00
test_file_io.py Added unit test for PathManager file io (with or without fvcore). 2019-12-09 14:19:51 -08:00
test_iterators.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_label_smoothing.py refactor namespaces in criterion interface (#1729) 2020-03-04 16:43:59 -08:00
test_memory_efficient_fp16.py Clean up tests 2020-01-22 11:29:20 -08:00
test_metrics.py Fix logging of training sets (fixes #1632) (#1634) 2020-01-20 16:34:33 -08:00
test_multi_corpus_sampled_dataset.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_multihead_attention.py Fixing key padding mask during transformer generation 2019-11-05 06:50:53 -08:00
test_noising.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_reproducibility.py Script MultiheadAttention (#1002) 2020-01-21 18:35:28 -08:00
test_resampling_dataset.py Add dataset class for weighted sampling with replacement. (#861) 2019-09-19 10:36:00 -07:00
test_sequence_generator.py Add a diverse beam search variant to sequence_generator.py (#953) 2020-01-06 08:24:02 -08:00
test_sequence_scorer.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_sparse_multihead_attention.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_token_block_dataset.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_train.py Use 1-based indexing for epochs everywhere (#1053) 2020-03-04 16:37:24 -08:00
test_utils.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
utils.py Rewrite the unit test of sequence generator 2020-02-26 11:09:20 -08:00