fairseq/tests
Vimal Manohar 8feccf9441 EMA
Summary:
Adds an exponential moving average (EMA) model for Kaizen semi-supervised training (https://arxiv.org/abs/2106.07759).

1. Add `ema.store_ema` to enable storing the EMA. The EMA state is written to `extra_state` in the state dict when saving a checkpoint.
2. `ema.ema_start_update` controls when the EMA starts accumulating (see the config sketch below).
3. Tasks can use the `uses_ema` property to decide whether the EMA should be passed to the task (default is False).
4. `load_ema_from_checkpoint` can be used to load the EMA model in place of the regular model for evaluation. Pyspeech has an eval-ema option for this.
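
For illustration, here is a minimal sketch of how these options might be grouped as a config dataclass. The field names mirror the `ema.*` options above, but the grouping, the defaults, and the `EMAConfig` name itself are assumptions rather than fairseq's actual config class:

```
from dataclasses import dataclass


@dataclass
class EMAConfig:  # hypothetical mirror of the ema.* options listed above
    store_ema: bool = False    # if True, write the EMA state to extra_state["ema"] when checkpointing
    ema_decay: float = 0.999   # decay factor for the moving average (default value assumed)
    ema_start_update: int = 0  # update step at which the EMA starts accumulating
```

The module docstring describes the intended usage: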

```
This module has the EMA class used to store a copy of the exponentially decayed
model params.

Typical usage of the EMA class involves initializing an object using an existing
model (random or from a seed model) and setting config options such as ema_decay
and ema_start_update, which determine how the EMA model is updated. After every
update of the model, i.e. at the end of train_step, the EMA should be updated by
passing the new model to the EMA.step function. The EMA model state dict can be
stored in the extra state under the key "ema", dumped into a checkpoint, and
loaded back. The EMA object can be passed to tasks by setting the task.uses_ema
property.

EMA is a smoothed/ensemble model which may perform better when used for
inference or further fine-tuning. The EMA class has a reverse function to load
the EMA params into a model so it can be used like a regular model.
```
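
Below is a minimal sketch of the loop this docstring describes, assuming the `EMA.step` and `EMA.reverse` names mentioned above; the constructor signature and the decay arithmetic are assumptions, not the actual fairseq implementation:

```
import copy

import torch


class EMA:
    """Keep an exponentially decayed copy of a model's parameters (sketch)."""

    def __init__(self, model, decay=0.999, start_update=0):
        self.model = copy.deepcopy(model)  # holds the decayed params
        self.decay = decay
        self.start_update = start_update
        self.num_updates = 0

    @torch.no_grad()
    def step(self, new_model):
        """Call at the end of every train_step with the updated model."""
        self.num_updates += 1
        # Before ema_start_update, track the raw params (decay of 0) so the
        # average only starts accumulating once training has warmed up.
        decay = self.decay if self.num_updates >= self.start_update else 0.0
        for ema_p, p in zip(self.model.parameters(), new_model.parameters()):
            ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

    def reverse(self, model):
        """Load the smoothed EMA params into `model` so it can be used
        like a regular model for inference or fine-tuning."""
        model.load_state_dict(self.model.state_dict())
        return model
```

At checkpoint time the EMA state dict would be stored under `extra_state["ema"]` and restored from there on load; buffers and FP16 handling are omitted here for brevity.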

Reviewed By: cruvadom

Differential Revision: D24238379

fbshipit-source-id: 879d3ba5070a614b7d365f9503af357001e875b2
2021-09-01 12:29:51 -07:00
distributed Add tests for fairseq.distributed.utils.all_gather_list (#1548) 2021-01-28 14:21:10 -08:00
gpu EMA 2021-09-01 12:29:51 -07:00
speech_recognition Enable Hydra configs in fairseq (#1343) (#1510) 2020-10-20 00:32:26 -07:00
__init__.py remediation of S205607 2020-07-17 17:21:51 -07:00
test_activation_checkpointing.py Make checkpoint wrapper pickleable (#1603) 2021-02-06 08:07:32 -08:00
test_amp_optimizer.py Add torch.cuda.amp support (#3460) 2021-05-26 14:39:10 -07:00
test_average_checkpoints.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_backtranslation_dataset.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_binaries.py MultiGPU test and --log-file workaround (#1793) 2021-04-21 06:39:00 -07:00
test_character_token_embedder.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_checkpoint_utils.py Move checkpoint state_dict creation into Trainer (#1666) 2021-03-04 13:32:44 -08:00
test_concat_dataset.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_constraints.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_convtbc.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_data_utils.py batch_by_size refactoring: 100x speedup and optimization of memory footprint 2020-12-28 21:05:51 -08:00
test_dataclass_utils.py Hierarchical Configs 2021-07-16 04:56:12 -07:00
test_dataset.py Add support for FullyShardedDataParallel (--ddp-backend=fully_sharded) (#1667) 2021-03-04 13:32:46 -08:00
test_dictionary.py Extract File Chunking to its own utils (#1955) 2021-06-28 01:46:32 -07:00
test_ema.py EMA 2021-09-01 12:29:51 -07:00
test_export.py Improve torchscript compatibility of transformer and transformer pg (#3247) 2021-02-22 14:22:54 -08:00
test_file_chunker_utils.py Extract File Chunking to its own utils (#1955) 2021-06-28 01:46:32 -07:00
test_file_io.py Delete line that breaks gh ci (#1814) 2021-04-19 16:31:11 -07:00
test_fp16_optimizer.py end to end hydra configs (#1393) 2020-11-04 18:20:12 -08:00
test_huffman.py Indexed Huffman Coded dataset (#2029) 2021-08-31 01:12:35 -07:00
test_inference_dropout.py Enable Hydra configs in fairseq (#1343) (#1510) 2020-10-20 00:32:26 -07:00
test_iopath.py Support atomic saves for checkpoints (#1520) 2020-12-18 07:40:49 -08:00
test_iterators.py Simplify CountingIterator 2021-04-29 16:17:00 -07:00
test_label_smoothing.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_lm_context_window.py Fix --context-window and add test (#1526) 2020-12-23 18:35:54 -08:00
test_lstm_jitable.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_memory_efficient_fp16.py Enable Hydra configs in fairseq (#1343) (#1510) 2020-10-20 00:32:26 -07:00
test_metrics.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_multi_corpus_dataset.py optimize sampling process of multi_corpus_dataset 2021-03-03 19:31:40 -08:00
test_multi_corpus_sampled_dataset.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_multihead_attention.py Adding check for filler size (#3495) 2021-04-21 09:09:19 -07:00
test_noising.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_online_backtranslation.py Obt 2 (#1614) 2021-03-30 09:56:03 -07:00
test_plasma_utils.py Plasma tests: ask for less disk (#1893) 2021-05-24 09:00:18 -07:00
test_reproducibility.py Add torch.cuda.amp support (#3460) 2021-05-26 14:39:10 -07:00
test_resampling_dataset.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_roberta.py Obt 2 (#1614) 2021-03-30 09:56:03 -07:00
test_sequence_generator.py fix beam search with prefix tokens (#2227) 2021-08-30 18:07:13 -07:00
test_sequence_scorer.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_sparse_multihead_attention.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_token_block_dataset.py TokenBlockDataset np type promotion issue (#1658) 2021-02-26 21:00:38 -08:00
test_train.py fixes tests/test_train.py to mock checkpoint.save_dir config node (#3675) 2021-07-06 15:07:31 -07:00
test_transformer.py fix MultiHeadAttention assert (#1798) 2021-04-14 04:59:59 -07:00
test_utils.py Apply black+isort (#1357) 2020-10-18 18:14:51 -07:00
test_valid_subset_checks.py Migrate DummyMaskedLMTask to FairseqTask (#3593) 2021-06-10 09:43:08 -07:00
utils.py MultiGPU test and --log-file workaround (#1793) 2021-04-21 06:39:00 -07:00