fairseq/tests
Angela Fan 1c8ab79ca5 quant noise code, readme, start of adding quantization (#1896)
Summary:
FUNCTIONALITY:
This diff provides two core pieces of functionality:
- Adds training with quantization noise from "Training with Quantization Noise for Extreme Model Compression" (Fan et al., 2020), controlled by the "quant_noise" and "quant_noise_block_size" parameters. Added to the embeddings, attention, and FFN layers for BERT and Transformer LM training.
- Adds quantization with product quantization based on code from "And the bit goes down: Revisiting the quantization of neural networks" (Stock et al., 2019). This is applied to a trained fairseq model to quantize it after training.
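To make the first bullet concrete, here is a minimal sketch of block-wise quantization noise as a structured dropout over weight blocks, in the spirit of the paper. This is an illustrative standalone function (the name `quant_noise_mask` and its exact rescaling are assumptions for illustration, not the fairseq API):

```python
import torch

def quant_noise_mask(weight, p=0.1, block_size=8):
    # Illustrative sketch only, not the fairseq implementation.
    # weight: (out_features, in_features); in_features must be a
    # multiple of block_size. With probability p, a contiguous block
    # of weights is dropped (a proxy for the noise that quantizing
    # that block would introduce); survivors are rescaled by 1/(1-p).
    out_f, in_f = weight.shape
    num_blocks = in_f // block_size
    # one Bernoulli draw per block, expanded to cover the block's columns
    mask = torch.bernoulli(torch.full((out_f, num_blocks), p))
    mask = mask.repeat_interleave(block_size, dim=1).bool()
    return weight.masked_fill(mask, 0.0) / (1 - p)
```

At eval time no mask is applied (matching test case 6 below, which checks that the structured dropout is disabled at eval).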
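For the second bullet, a toy sketch of product quantization: split each weight row into fixed-size sub-vectors, cluster them with k-means, and store one centroid index per block. The function name `product_quantize` and the plain Lloyd's k-means are illustrative assumptions, not the iPQ code this diff ports:

```python
import torch

def product_quantize(weight, block_size=4, n_centroids=16, iters=10):
    # Toy illustration of product quantization, not the fairseq code.
    # Splits `weight` into sub-vectors of length block_size, clusters
    # them, and reconstructs each block from its nearest centroid.
    out_f, in_f = weight.shape
    subvecs = weight.reshape(-1, block_size)           # (N, block_size)
    # initialize centroids from random sub-vectors
    idx = torch.randperm(subvecs.size(0))[:n_centroids]
    centroids = subvecs[idx].clone()
    for _ in range(iters):                             # plain Lloyd's k-means
        assign = torch.cdist(subvecs, centroids).argmin(dim=1)
        for c in range(n_centroids):
            members = subvecs[assign == c]
            if members.numel():
                centroids[c] = members.mean(dim=0)
    assign = torch.cdist(subvecs, centroids).argmin(dim=1)
    # reconstruct: replace every block with its centroid
    quantized = centroids[assign].reshape(out_f, in_f)
    return quantized, assign, centroids
```

The storage win comes from keeping only the small codebook plus one small integer index per block instead of the full-precision weights.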

TODO:
-> Pierre, look at quantization code
-> int4 and int8 quantization will be added soon.

EVALUATED TEST CASES:

0. Training of LM and BERT models starts from scratch with no errors -> yes

1. Retrain LM from scratch with code, no quantization, reproduces Wikitext-103 LM results -> yes, see /checkpoint/angelafan/qn_open_source_noise

2. Reload a previously trained LM (not trained with quant noise), reproduces Wikitext-103 LM results -> yes

3. Train LM from scratch with code, not trained with quant noise, reproduces Wikitext-103 LM results -> yes, see /checkpoint/angelafan/qn_open_source_baseline

4. Train BERT model from scratch with code, no quantization, training curve looks the same as before -> yes

5. Check wps during training and wps during inference, no large change from before -> yes

6. Check structured dropout isn't being applied at eval time -> yes

7. Works in combination with LayerDrop -> yes
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1896

Reviewed By: myleott

Differential Revision: D20609420

Pulled By: huihuifan

fbshipit-source-id: 94468dd811c4caaaef46a9fab2b8d381f9d2b955
2020-04-21 09:28:56 -07:00
speech_recognition refactor namespaces in criterion interface (#1729) 2020-03-04 16:43:59 -08:00
__init__.py fairseq-py goes distributed (#106) 2018-02-27 17:09:42 -05:00
test_average_checkpoints.py Small fixes 2019-08-19 15:08:25 -07:00
test_backtranslation_dataset.py Deprecate the SequenceGenerator with the Scripted vision (#1120) 2020-04-07 13:28:30 -07:00
test_binaries.py quant noise code, readme, start of adding quantization (#1896) 2020-04-21 09:28:56 -07:00
test_bmuf.py Fix BMUF using 1 GPU 2020-04-16 11:25:35 -07:00
test_character_token_embedder.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_concat_dataset.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_convtbc.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_dictionary.py Allow dictionaries to overwrite entries with #fairseq:overwrite comment (#1073) 2020-03-08 06:52:00 -07:00
test_export.py TorchScript support for AANTransformer 2020-04-10 18:23:50 -07:00
test_file_io.py Added unit test for PathManager file io (with or without fvcore). 2019-12-09 14:19:51 -08:00
test_iterators.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_label_smoothing.py refactor namespaces in criterion interface (#1729) 2020-03-04 16:43:59 -08:00
test_lstm_jitable.py Update Fairseq LSTM to jitable version (#2016) 2020-04-16 15:49:56 -07:00
test_memory_efficient_fp16.py Clean up tests 2020-01-22 11:29:20 -08:00
test_metrics.py Fix logging of training sets (fixes #1632) (#1634) 2020-01-20 16:34:33 -08:00
test_multi_corpus_sampled_dataset.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_multihead_attention.py Fixing key padding mask during transformer generation 2019-11-05 06:50:53 -08:00
test_noising.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_reproducibility.py Fix validation happening twice at the end of epoch (#1934) 2020-04-03 16:38:39 -07:00
test_resampling_dataset.py Add dataset class for weighted sampling with replacement. (#861) 2019-09-19 10:36:00 -07:00
test_sequence_generator.py Script _no_repeat_ngram in fb_simple_sequence_generator (#1963) 2020-04-10 14:44:42 -07:00
test_sequence_scorer.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_sparse_multihead_attention.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_token_block_dataset.py Relicense fairseq under MIT license (#786) 2019-07-30 07:48:23 -07:00
test_train.py Use 1-based indexing for epochs everywhere (#1053) 2020-03-04 16:37:24 -08:00
test_utils.py Fix max_position resolution with tuples having len > 2 (#2028) 2020-04-21 06:01:14 -07:00
transformer_quantization_config.yaml quant noise code, readme, start of adding quantization (#1896) 2020-04-21 09:28:56 -07:00
utils.py Make TransformerDecoupled model scriptable (#1125) 2020-04-01 17:53:49 -07:00