* fix imports referencing moved metrics.py file
* Make representation computation branchless in TransformerEncoderBase (#4818)
Summary:
We want to make this computation branchless because fairseq code may be
exported and traced for deployment purposes, and tracing can break the
correctness of a captured program if it depends on input data. In this diff
we rewrite the code to remove one branch so that the tracer can proceed
and preserve the correct semantics of the model.
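For illustration, here is a minimal sketch of this kind of rewrite (hypothetical tensor names and shapes, not the exact fairseq code): the data-dependent branch is replaced by arithmetic that is a no-op when nothing is padded, so torch.jit.trace captures a single graph that stays valid for all inputs.
```
import torch

# Branchy version: torch.jit.trace freezes whichever side of the `if` the
# example input exercises, silently breaking inputs that take the other side.
def forward_branchy(x, padding_mask):
    # x: (batch, time, dim); padding_mask: (batch, time), bool
    if padding_mask.any():
        x = x * (1 - padding_mask.unsqueeze(-1).type_as(x))
    return x

# Branchless version: the multiply is a no-op when the mask is all False,
# so the traced graph is correct for every input.
def forward_branchless(x, padding_mask):
    return x * (1 - padding_mask.unsqueeze(-1).type_as(x))
```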
Test Plan:
CI
* Fix Torchscript typing in transformer_encoder.py (#4847)
* Add Generative Spoken Dialogue Language Modeling (#4879)
* Update deprecated torch.qr in glow.py example (#4685)
torch.qr has been deprecated for a long time and is being removed by https://github.com/pytorch/pytorch/pull/70989.
This PR makes the example compatible with both new and old PyTorch versions.
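A small compatibility shim in the spirit of that fix (a sketch, not the exact change in glow.py):
```
import torch

def qr_compat(a):
    # torch.linalg.qr is available since PyTorch 1.8; mode="reduced"
    # matches the old torch.qr(a, some=True) behavior.
    if hasattr(torch, "linalg") and hasattr(torch.linalg, "qr"):
        return torch.linalg.qr(a, mode="reduced")
    return torch.qr(a, some=True)
```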
* Emotion Conversion Paper Open Source (#4895)
* data2vec v2.0 (#4903)
data2vec 2.0
Co-authored-by: Arun Babu <arbabu@fb.com>
Co-authored-by: Wei-Ning Hsu <wnhsu@csail.mit.edu>
* remove missing config entries when loading task from checkpoint (#4905)
* make apex optional (#4906)
* Add file to generate manifests for stop dataset. (#4891)
* Update STOP dataset README to include proper link. (#4892)
* Update README.md (#4893)
* using foreach ops to reduce kernel launches (#4904; see the sketch after this entry)
* using foreach to reduce kernel
* set reproducibility to looser threshold
* revert optimizer
* update
Co-authored-by: juntengjia <juntengjia@fb.com>
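A minimal sketch of the foreach idea above (illustrative tensors, not the actual optimizer diff): the per-tensor Python loop issues one kernel launch per parameter, while the _foreach variant covers all parameters in one fused launch.
```
import torch

params = [torch.randn(1024, 1024) for _ in range(16)]
grads = [torch.randn_like(p) for p in params]

# Per-tensor loop: one kernel launch per parameter.
for p, g in zip(params, grads):
    p.add_(g, alpha=-0.1)

# foreach variant: a single fused launch across all parameters.
torch._foreach_add_(params, grads, alpha=-0.1)
```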
* Update README.md to add data2vec blog post (#4913)
* Update README.md
* Update config to fix circleci failure (#4949)
https://app.circleci.com/pipelines/github/fairinternal/fairseq-py/12635/workflows/3befbae2-79c4-458d-9fc4-aad4484183b4/jobs/26767
* Generative Spoken Dialogue Language Modeling Paper Open Source (#4957)
* wav2vec2_laser (#4968)
* ASR BLEU tool copied from ust branch into main (#4914)
* Add transcript option for asr-bleu (#4981)
---------
Co-authored-by: zhxchen17 <zhxchen17@outlook.com>
Co-authored-by: zhxchen17 <zhxchen17@fb.com>
Co-authored-by: Nguyen Tu Anh <nguyentuanh208@gmail.com>
Co-authored-by: Sergii Dymchenko <kit1980@gmail.com>
Co-authored-by: Felix Kreuk <felixkreuk@gmail.com>
Co-authored-by: Alexei Baevski <alexei.b@gmail.com>
Co-authored-by: padentomasello <pdtomasello@gmail.com>
Co-authored-by: Junteng Jia <juntengjia@hotmail.com>
Co-authored-by: juntengjia <juntengjia@fb.com>
Co-authored-by: arbabu123 <arbabu@fb.com>
Co-authored-by: dianaml0 <82468439+dianaml0@users.noreply.github.com>
Co-authored-by: Pierre Andrews <mortimer@fb.com>
Co-authored-by: Ilia Kulikov <kulikov@cs.nyu.edu>
Co-authored-by: Xutai Ma <xutaima@gmail.com>
Summary:
Pull Request resolved: https://github.com/facebookresearch/fairseq/pull/4513
With some fixes to torchscript using dual copies.
Reland this diff.
Reviewed By: erichan1
Differential Revision: D37371293
fbshipit-source-id: 4fcfc4083955b6f5fc4ef8600f1b517b6ba69aae
Summary:
Context: https://fburl.com/7vdj7vhl
Backing out due to breaking our TorchScript test:
```
RuntimeError:
method cannot be used as a value:
File "/dev/shm/uid-30041/54641b26-seed-nspid4026533396_cgpid7154327-ns-4026533393/fairseq/modules/transformer_layer.py", line 307
self.in_proj_weight,
self.in_proj_bias,
self.self_attn.out_proj.weight,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
self.self_attn.out_proj.bias,
self.activation_relu_or_gelu == 2,
Stack trace:
Exception type: torch::jit::ErrorReport
```
https://fburl.com/sandcastle/4pzqemf5
Original commit changeset: 984266f850fc
Original Phabricator Diff: D37082681 (3a757d7ab2)
Differential Revision: D37303846
fbshipit-source-id: 1757ea5dae98be5beb4d08f70b0c3001d6ea336f
Summary:
Pull Request resolved: https://github.com/facebookresearch/fairseq/pull/4480
as titled and depends on D36057338
Fork the inference path inside the forward function: if a checkpoint has been loaded and we are running inference, we deploy BetterTransformer (BT); otherwise the regular fairseq path is used.
In summary:
Accuracy: there is some accuracy loss due to fp16; the maximum diff is around 0.009. If we set it to fp32, there is no accuracy loss.
Perf: the current fairseq has similar speed to the vanilla version. After the enablement, the speedup is similar to the standalone BT test.
With batch size = 64:
For V100, the speedup reaches 1.23x.
For A100, the speedup reaches 1.38x.
After enabling nested tensors:
For V100, the speedup reaches 2.46x.
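A minimal sketch of the forward-path fork described above (class and method names are illustrative, not the actual fairseq/BT API):
```
import torch.nn as nn

class EncoderWithFastPath(nn.Module):
    def __init__(self):
        super().__init__()
        # Set True after loading a BT-compatible checkpoint.
        self.can_use_fastpath = False

    def _forward_bettertransformer(self, x):
        return x  # placeholder for the fused BetterTransformer path

    def _forward_fairseq(self, x):
        return x  # placeholder for the regular fairseq path

    def forward(self, x):
        # Take the fast path only at inference time with a compatible
        # checkpoint loaded; training keeps the original fairseq path.
        if not self.training and self.can_use_fastpath:
            return self._forward_bettertransformer(x)
        return self._forward_fairseq(x)
```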
Reviewed By: mikekgfb
Differential Revision: D37082681
fbshipit-source-id: 984266f850fc30603e48be56e41ac2c67da080f5
Summary:
1. Add joint pre-training scripts
2. Replace prepend_tgt_lang_tag_no_change with prepend_tgt_lang_tag_as_bos
3. Add readme for the joint pre-training
4. Add test case for the Librispeech model
Reviewed By: hygong-fb
Differential Revision: D36300953
fbshipit-source-id: cb749689787ed97c1250d122bdefb7f7a2252292
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
- [x] formatting fix
- [x] optional import of xFormers
- [x] enabled doc building as part of CI
- [x] remove mask arguments for attentions that do not support them
- [x] remove masks for blocksparse tests, no longer supported
- [ ] use pytest instead of deprecated `setup.py test`
- [ ] CircleCI xFormers tests
Will submit without the last two done to unblock people using the repo
X-link: https://github.com/fairinternal/fairseq-py/pull/3362
Reviewed By: blefaudeux
Differential Revision: D36169572
Pulled By: dianaml0
fbshipit-source-id: 3b20ae5f377144a0854e016771af703f0d0d694b
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?
## What does this PR do?
This PR is a cleaned up version of https://github.com/fairinternal/fairseq-py/issues/2138. It is based on the `main` branch instead of the `gshard` branch. Removed call to xFormers MultiHeadDispatch, only using xFormers Attention.
X-link: https://github.com/fairinternal/fairseq-py/pull/2263
Reviewed By: blefaudeux
Differential Revision: D33800377
Pulled By: dianaml0
fbshipit-source-id: 658d52214c782212b12881b30c4d908a763b4cf2
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
X-link: https://github.com/fairinternal/fairseq-py/pull/3350
Reviewed By: shruti-bh
Differential Revision: D36009526
Pulled By: dianaml0
fbshipit-source-id: 9cdc3d53086b8d40a780bcb64cfe28108091ab98
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Pulling out some changes from https://github.com/fairinternal/fairseq-py/pull/2263 unrelated to xformers to make the PR cleaner
X-link: https://github.com/fairinternal/fairseq-py/pull/3068
Reviewed By: blefaudeux
Differential Revision: D34149016
Pulled By: dianaml0
fbshipit-source-id: 6442a5f451d56cc47106227298a624516b19a9ad
Summary:
# Before submitting
- [X] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [X] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [X] Did you make sure to update the docs?
- [X] Did you write any new necessary tests?
## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/4300
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Big time!
Note:
I had to update `black` because of [this known issue](https://github.com/psf/black/issues/2964):
```
black....................................................................Failed
- hook id: black
- exit code: 1
Traceback (most recent call last):
File "/Users/azzhipa/.cache/pre-commit/repoxt83whf2/py_env-python3.8/bin/black", line 8, in <module>
sys.exit(patched_main())
File "/Users/azzhipa/.cache/pre-commit/repoxt83whf2/py_env-python3.8/lib/python3.8/site-packages/black/__init__.py", line 1423, in patched_main
patch_click()
File "/Users/azzhipa/.cache/pre-commit/repoxt83whf2/py_env-python3.8/lib/python3.8/site-packages/black/__init__.py", line 1409, in patch_click
from click import _unicodefun
ImportError: cannot import name '_unicodefun' from 'click' (/Users/azzhipa/.cache/pre-commit/repoxt83whf2/py_env-python3.8/lib/python3.8/site-packages/click/__init__.py)
```
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4344
Reviewed By: zhengwy888
Differential Revision: D35691648
Pulled By: dianaml0
fbshipit-source-id: 4bdf408bc9d9cca76c9c08e138cf85b1d00d14d4
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3107
Reviewed By: cndn
Differential Revision: D34354339
Pulled By: sravyapopuri388
fbshipit-source-id: 50888706123d246c13d2cbb22d0e043740ff6bf5
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3065
Reviewed By: Mortimerp9
Differential Revision: D34144674
Pulled By: dianaml0
fbshipit-source-id: 842b0d29c9c85d4b56b640f2823fcb4e3f912f98
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3059
Reviewed By: kahne
Differential Revision: D34083178
Pulled By: sravyapopuri388
fbshipit-source-id: a33af1696570be4826973b19fe34177bcf851e06
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3001
Reviewed By: kahne
Differential Revision: D33904550
Pulled By: sravyapopuri388
fbshipit-source-id: f55f8121d83e5abebdfcf7ac90dcba39f65cafaf
Summary: The GPU test was broken after D33809223 (1b61bbad32)
Reviewed By: cruvadom
Differential Revision: D33931570
fbshipit-source-id: 37962a437d8e25b1dafc58db0efa55c1afa5f3ee
Summary:
This is the same as https://github.com/fairinternal/fairseq-py/issues/3003 but for main instead of gshard.
The lint test runs the latest version of black, which is 22.1.0 right now and seems to be incompatible with the 21.12b0 version that is set up in pre-commit. This means that some files were validly formatted in the past, but are not anymore...
This PR formats these files with 22.1.0 and auto-updates the pre-commit config to use that black version too.
(Note: this is the second time this has happened. A solution would be to pin the lint test to the same black version as the pre-commit hook, i.e. the version used to format everything clean, so that formatting stays stable.)
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3004
Reviewed By: dianaml0
Differential Revision: D33917490
Pulled By: Mortimerp9
fbshipit-source-id: d55e800b976f94545cdab4132daa7c45cbd0e34c
Summary: EMA has been broken since D33649708 (995c204337) due to an indentation error.
Reviewed By: cruvadom
Differential Revision: D33809223
fbshipit-source-id: c6c4d0d327443bfea787817040e1832eef0f50e4
Summary:
Preliminaries for the data2vec release, including some minor improvements and bug fixes.
The most important change is that we now default to raising an exception when fields in the config do not have a corresponding field in the model dataclass.
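A minimal sketch of the stricter default (illustrative names, not the actual fairseq dataclass utilities): unknown config keys now raise instead of being silently dropped.
```
from dataclasses import dataclass, fields

def validate_cfg(cfg: dict, dc_type):
    # Raise if the config carries keys with no matching dataclass field.
    known = {f.name for f in fields(dc_type)}
    unknown = set(cfg) - known
    if unknown:
        raise ValueError(f"unknown config fields for {dc_type.__name__}: {sorted(unknown)}")

@dataclass
class ModelConfig:
    encoder_embed_dim: int = 768

validate_cfg({"encoder_embed_dim": 1024}, ModelConfig)  # ok
validate_cfg({"encodr_embed_dim": 1024}, ModelConfig)   # raises ValueError
```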
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2929
Reviewed By: wnhsu
Differential Revision: D33649708
Pulled By: alexeib
fbshipit-source-id: 629bdb4c361550740b451c570c2005bb956c6fcb
Summary:
Support FFN pruning for Fairseq. For example, the user can apply pruning on top of a RoBERTa base model by specifying the argument "--ffn-blocks-to-remove 1024". The user also needs to provide a checkpoint which is already pruned so that the pruned checkpoint can be loaded correctly.
The pruning procedure can be summarized as (a sketch of step 2 follows the list):
1. Fine-tune the model (e.g. the RoBERTa encoder) on a given dataset with regularization.
2. After the model is trained, use the _get_fc_rank and _prune_fc_layer functions to get the top X blocks with the most importance in each transformer layer, then use that rank to prune a new RoBERTa encoder and save the pruned checkpoint manually.
3. Fine-tune the new RoBERTa encoder via the checkpoint saved above.
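A minimal sketch of the rank-and-prune step for one FFN (assumed shapes and a simple magnitude score; the real helpers are _get_fc_rank and _prune_fc_layer on the transformer layer):
```
import torch
import torch.nn as nn

def get_fc_rank(fc1: nn.Linear, blocks_to_keep: int) -> torch.Tensor:
    # Score each intermediate unit (row of fc1.weight) by L1 magnitude
    # and return the indices of the top-scoring units.
    scores = fc1.weight.abs().sum(dim=1)
    return torch.topk(scores, blocks_to_keep).indices.sort().values

def prune_fc_layer(fc1: nn.Linear, fc2: nn.Linear, keep: torch.Tensor):
    # Keep only the selected units: rows of fc1, columns of fc2.
    fc1_new = nn.Linear(fc1.in_features, keep.numel())
    fc1_new.weight.data = fc1.weight.data[keep].clone()
    fc1_new.bias.data = fc1.bias.data[keep].clone()
    fc2_new = nn.Linear(keep.numel(), fc2.out_features)
    fc2_new.weight.data = fc2.weight.data[:, keep].clone()
    fc2_new.bias.data = fc2.bias.data.clone()
    return fc1_new, fc2_new
```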
Reviewed By: dianaml0
Differential Revision: D33525055
fbshipit-source-id: 5087140ee891d6ec9266726e3a477947c233412c
Summary:
This is the equivalent of PR https://github.com/fairinternal/fairseq-py/issues/2697 but on top of main instead of gshard (cherry-picked and merged the squash):
* reorganize preprocess.py code a bit
* use Binarizers objects in the multiprocess code
* clean up the make_binary
* multiprocess logic
* learn to count
* format and doc string
* add basic test for vocab binarizer
* generalize to one line
* move multiprocess in binarizer
Testing:
```
python -m fairseq_cli.preprocess --only-source --trainpref ~/fixathon/small_vocab_test/train.in --destdir ~/fixathon/small_vocab_test/data-bin.cherry --workers 20
python -m fairseq_cli.preprocess --only-source --trainpref ~/fixathon/small_vocab_test/train.in --destdir ~/fixathon/small_vocab_test/data-bin.main --workers 20
```
```
md5sum ~/fixathon/small_vocab_test/data-bin.cherry/train.bin == md5sum ~/fixathon/small_vocab_test/data-bin.main/train.bin
```
```
diff ~/fixathon/small_vocab_test/data-bin.main/dict.txt ~/fixathon/small_vocab_test/data-bin.cherry/dict.txt
```
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2738
Reviewed By: sshleifer, dianaml0
Differential Revision: D32830875
Pulled By: Mortimerp9
fbshipit-source-id: e7463d5cdd96a877691bf39666daa319ebb3dcb8
Summary:
Support multihead attention pruning for Fairseq. For example, the user can apply pruning on top of a RoBERTa base model by specifying the argument "--mha-heads-to-keep 8". The user also needs to provide a checkpoint which is already pruned so that the pruned checkpoint can be loaded correctly.
The pruning procedure can be summarized as (a sketch of step 2 follows):
1. Fine-tune the model (e.g. the RoBERTa encoder) on a given dataset with regularization.
2. After the model is trained, use the get_reserve_head_index and _adaptive_prune_heads functions to get the top X heads with the most importance, then use that rank to prune a new RoBERTa encoder and save the pruned checkpoint manually.
3. Fine-tune the new RoBERTa encoder via the checkpoint saved above.
To avoid registering a separate pruned version of RoBERTa, the argument --mha-heads-to-keep is used to prune the RoBERTa model into a pruned version which matches the pruned checkpoint.
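A minimal sketch of the head-ranking step (the magnitude heuristic and weight layout here are assumptions; the real logic lives in get_reserve_head_index and _adaptive_prune_heads):
```
import torch

def reserve_head_index(out_proj_weight, num_heads, heads_to_keep):
    # out_proj_weight: (embed_dim, num_heads * head_dim). Score each head
    # by the L1 magnitude of its slice of the output projection.
    embed_dim = out_proj_weight.shape[0]
    head_dim = out_proj_weight.shape[1] // num_heads
    per_head = out_proj_weight.reshape(embed_dim, num_heads, head_dim)
    scores = per_head.abs().sum(dim=(0, 2))
    return torch.topk(scores, heads_to_keep).indices.sort().values
```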
Reviewed By: dianaml0
Differential Revision: D32449003
fbshipit-source-id: a952fd9ad723a6dbc5c2af574c42f2e9a1fa27dc
Summary:
**This PR**
- Adds a conformer layer based on https://arxiv.org/pdf/2005.08100.pdf (see the schematic sketch after this summary).
- The conformer implementation supports multihead attention based on 3 different positional embedding types: absolute positional embedding, relative positional encoding, and rotary positional embedding.
- Adds conformer encoder with conv1d subsampling, positional embedding followed by N conformer layers
- Adds S2T_Conformer model based on the conformer encoder and transformer decoder.
- Add conformer support in Wav2Vec2
- Add unit tests for core modules
**Verification**
- Verified the set up on MUST-C En-De S2T, Covost2 Es-En S2T, Librispeech ASR to ensure the implementation is correct.
- For S2T setups, the performance is either similar to the transformer based models or better.
- Wav2vec2 pretraining and finetuning based on librispeech showed improvements over corresponding transformer baselines.
- [WIP] Experiment log: https://docs.google.com/document/d/1QI-ROWVenUEXPJoHTaKD85Fq7T8ZXNc8bc54MzgwJjA/edit#
**Next steps**
- Add regression tests
- Add README and open source checkpoints
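A schematic conformer block in the macaron ordering from the paper (half-step FFN, self-attention, convolution module, half-step FFN, final layer norm). Dimensions, the attention variant, and module details are simplifications, not the fairseq implementation:
```
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvModule(nn.Module):
    # Pointwise conv -> GLU -> depthwise conv -> norm -> SiLU -> pointwise conv.
    def __init__(self, d, kernel=31):
        super().__init__()
        self.norm = nn.LayerNorm(d)
        self.pw1 = nn.Conv1d(d, 2 * d, 1)
        self.dw = nn.Conv1d(d, d, kernel, padding=kernel // 2, groups=d)
        self.bn = nn.BatchNorm1d(d)
        self.pw2 = nn.Conv1d(d, d, 1)

    def forward(self, x):                 # x: (B, T, d)
        y = self.norm(x).transpose(1, 2)  # (B, d, T) for the conv stack
        y = F.glu(self.pw1(y), dim=1)
        y = self.pw2(F.silu(self.bn(self.dw(y))))
        return x + y.transpose(1, 2)      # residual

class ConformerBlock(nn.Module):
    def __init__(self, d=256, heads=4):
        super().__init__()
        self.ffn1, self.ffn2 = self._ffn(d), self._ffn(d)
        self.attn_norm, self.out_norm = nn.LayerNorm(d), nn.LayerNorm(d)
        self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.conv = ConvModule(d)

    @staticmethod
    def _ffn(d):
        return nn.Sequential(nn.LayerNorm(d), nn.Linear(d, 4 * d), nn.SiLU(), nn.Linear(4 * d, d))

    def forward(self, x):                 # x: (B, T, d)
        x = x + 0.5 * self.ffn1(x)        # macaron half-step FFN
        a = self.attn_norm(x)
        x = x + self.attn(a, a, a, need_weights=False)[0]
        x = self.conv(x)                  # conv module includes its residual
        x = x + 0.5 * self.ffn2(x)
        return self.out_norm(x)
```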
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2859
Reviewed By: kahne
Differential Revision: D33434092
Pulled By: sravyapopuri388
fbshipit-source-id: 62f22b917a332481370750e04a439e05832a2282
Summary: Add test for DualInputS2TTransformerModel at examples/speech_text_joint_to_text/models/s2t_dualinputtransformer.py
Reviewed By: kahne
Differential Revision: D33284188
fbshipit-source-id: c02b697fc7734425661e00bbb606852b5d94a587
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Applied `black` and `isort` to fix failing CI
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2834
Reviewed By: vedanuj
Differential Revision: D33262876
Pulled By: dianaml0
fbshipit-source-id: 03215c276fcddda9f7c78971bf6ed7c5ac21b2ee
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Add readme and task for xglm models.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2808
Reviewed By: punitkoura
Differential Revision: D33237928
Pulled By: xianxl
fbshipit-source-id: 7773cf56e896210dab1f4311ae69f0e00c6d9aff
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
fix `black` failures
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2816
Reviewed By: alexeib
Differential Revision: D33172615
Pulled By: dianaml0
fbshipit-source-id: 36b141f42941670f1bfa981041d878042feb0428
Summary: Adding integration test (based on test set scores on pre-trained checkpoints) for fastspeech2
Reviewed By: yuntang
Differential Revision: D33143301
fbshipit-source-id: dca0841b43dd1cb2933ce5c652ed3cdff0fc4a52
Summary:
Adding the first batch of speech integration tests (based on test set scores on pre-trained checkpoints) for
- S2T transformer
- TTS transformer
Reviewed By: yuntang
Differential Revision: D33050653
fbshipit-source-id: fb5bb9f46e8e17cb705971ca1990c8e1cb99d5f9
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Reverting to fix the issue mentioned [here](https://github.com/pytorch/fairseq/issues/3913). A follow-up PR will fix the original issue later.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2763
Reviewed By: myleott
Differential Revision: D33000411
Pulled By: jingfeidu
fbshipit-source-id: 95a54cbdc612129a0eab4b5e6aa576a5bcf00588
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
- [x] applies flake8 fixes to main branch (https://github.com/fairinternal/fairseq-py/issues/2546) - still more to be fixed
Fix GPU tests:
- [x] when torch.ao.quantization import doesn't work use torch.quantization
- [x] build apex from earlier commit in circleci so that its compatible with pytorch 1.8 and 1.9
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2570
Reviewed By: Mortimerp9
Differential Revision: D32955312
Pulled By: dianaml0
fbshipit-source-id: e163cbd4998f171f819e31b0682c1c0f1986f9e1
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2678
Reviewed By: Mortimerp9
Differential Revision: D32653381
Pulled By: dianaml0
fbshipit-source-id: 2810d14867cd7d64f4d340740e2b590b82de47fe
Summary:
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?
## What does this PR do?
SlowMo is being moved to [Fairscale](https://fairscale.readthedocs.io/en/latest/). This commit updates the implementation of SlowMo to the Fairscale version. It also adds tests for SlowMo.
Note: This PR is currently up for review and will be merged at a later date, once SlowMo has been moved to Fairscale. SlowMo is being merged to Fairscale as part of [a PR](https://github.com/facebookresearch/fairscale/pull/378); once that PR lands in Fairscale, this PR on fairseq will be ready for merge.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/3996
Reviewed By: dianaml0
Differential Revision: D32280163
Pulled By: vtantia
fbshipit-source-id: 70c97b04a7cdc90ada7099375c2a31b0c978ba70
Summary:
CPLTaskImpl provides an implementation to augment existing tasks to take an additional ema_model input in their train_step and valid_step for continuous pseudo-labeling (CPL) during training. It passes this ema_model to the criterion.
See Kaizen semi-supervised training paper for more details https://arxiv.org/abs/2106.07759.
This implementation also supports using CPLDataset which enables using unsupervised data only for `cpl_finetune_epoch > epochs >= cpl_start_epoch`. CPLDataset is like MultiCorpusDataset but ignores the unsupervised datasets while sampling.
Another addition in this diff is to skip dataset in MultiCorpusDataset if the sampling probability is 0.
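A minimal sketch of the augmented step (hypothetical signature mirroring the description above, not the exact CPLTaskImpl code): the task threads ema_model through to the criterion so the criterion can derive pseudo-labels from the teacher.
```
# Inside a task class; the criterion is assumed to accept an ema_model kwarg.
def train_step(self, sample, model, criterion, optimizer, update_num, ignore_grad=False, ema_model=None):
    model.train()
    loss, sample_size, logging_output = criterion(model, sample, ema_model=ema_model)
    if ignore_grad:
        loss = loss * 0
    optimizer.backward(loss)
    return loss, sample_size, logging_output
```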
Reviewed By: cruvadom
Differential Revision: D30701536
fbshipit-source-id: 1d840eacfd538ed7aed3baaefc8b254390642b45
Summary:
Adds Exponential moving average (EMA) model for Kaizen semi-supervised training https://arxiv.org/abs/2106.07759
1. Add `ema.store_ema` to enable storing EMA. EMA will be written to extra_state in the state dict while saving checkpoint.
2. `ema.ema_start_update` to control when the EMA starts accumulating
3. Tasks can use `uses_ema` property to decide if the EMA should be passed to the task. (Default is False)
4. `load_ema_from_checkpoint` can be used to load the EMA model in place of the regular model for evaluation. Pyspeech has an eval-ema option for this.
```
This module has the EMA class used to store a copy of the exponentially decayed
model params.
Typical usage of EMA class involves initializing an object using an existing
model (random or from a seed model) and setting the config like ema_decay,
ema_start_update which determine how the EMA model is updated. After every
update of the model i.e. at the end of the train_step, the EMA should be updated
by passing the new model to the EMA.step function. The EMA model state dict
can be stored in the extra state under the key of "ema" and dumped
into a checkpoint and loaded. The EMA object can be passed to tasks
by setting task.uses_ema property.
EMA is a smoothed/ensemble model which might have better performance
when used for inference or further fine-tuning. EMA class has a
reverse function to load the EMA params into a model and use it
like a regular model.
```
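A minimal, self-contained sketch of the usage pattern this docstring describes (simplified API; the actual fairseq EMA additionally handles fp32 param copies, config, and extra_state round-tripping):
```
import copy
import torch

class EMA:
    def __init__(self, model, decay=0.999):
        self.decay = decay
        self.model = copy.deepcopy(model).eval()
        for p in self.model.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def step(self, new_model):
        # Called at the end of every train_step with the updated model.
        for ema_p, p in zip(self.model.parameters(), new_model.parameters()):
            ema_p.mul_(self.decay).add_(p, alpha=1.0 - self.decay)

    @torch.no_grad()
    def reverse(self, model):
        # Load the smoothed params back into a regular model for inference.
        model.load_state_dict(self.model.state_dict())
        return model
```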
Reviewed By: cruvadom
Differential Revision: D24238379
fbshipit-source-id: 879d3ba5070a614b7d365f9503af357001e875b2