Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3065
Reviewed By: Mortimerp9
Differential Revision: D34144674
Pulled By: dianaml0
fbshipit-source-id: 842b0d29c9c85d4b56b640f2823fcb4e3f912f98
Summary:
The only difference from a plain list/dict now is that nn.Parameters are
handled specially and registered as parameters properly.
test_nn and the parametrization tests pass locally.
We'll see in CI whether DP is fixed as well.
Tentative fix for https://github.com/pytorch/pytorch/issues/36035
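For illustration, a minimal sketch of the registration behaviour described above (the toy module and tensor shapes are made up, not from the PR):
```python
import torch
import torch.nn as nn

class ToyModule(nn.Module):
    def __init__(self):
        super().__init__()
        # Entries of an nn.ParameterList are registered on the module, so they
        # appear in .parameters() and are picked up by optimizers and DataParallel.
        self.weights = nn.ParameterList(
            [nn.Parameter(torch.randn(4, 4)) for _ in range(3)]
        )
        # A plain Python list of tensors would not be registered.
        self.scratch = [torch.randn(4, 4) for _ in range(3)]

m = ToyModule()
print(len(list(m.parameters())))  # 3 -- only the ParameterList entries
```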
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70499
Reviewed By: jbschlosser, alexeib
Differential Revision: D34005332
Pulled By: albanD
fbshipit-source-id: 7e76b0873d0fec345cb537e2a6ecba0258e662b9
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3059
Reviewed By: kahne
Differential Revision: D34083178
Pulled By: sravyapopuri388
fbshipit-source-id: a33af1696570be4826973b19fe34177bcf851e06
Summary:
ema.py, initially used by data2vec, was actually created for trainer-level EMA tracking.
Since data2vec creates and uses EMA tracking within the model, we will split EMA into a separate module-level implementation.
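For context, a minimal sketch of what a module-level EMA tracker looks like; the `SimpleEMA` class below is hypothetical and is not the actual fairseq EMAModule API:
```python
import copy
import torch

class SimpleEMA:
    """Minimal module-level EMA tracker (illustrative only)."""

    def __init__(self, model: torch.nn.Module, decay: float = 0.999):
        self.decay = decay
        # Keep a detached copy of the model whose weights hold the moving average.
        self.ema_model = copy.deepcopy(model)
        for p in self.ema_model.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def step(self, model: torch.nn.Module):
        # ema = decay * ema + (1 - decay) * current
        for ema_p, p in zip(self.ema_model.parameters(), model.parameters()):
            ema_p.mul_(self.decay).add_(p, alpha=1.0 - self.decay)
```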
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3036
Reviewed By: wnhsu
Differential Revision: D34034479
Pulled By: alexeib
fbshipit-source-id: f8c65552d446f1104c36380f5d1ff22a75e6e405
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3001
Reviewed By: kahne
Differential Revision: D33904550
Pulled By: sravyapopuri388
fbshipit-source-id: f55f8121d83e5abebdfcf7ac90dcba39f65cafaf
Summary: The GPU test was broken after D33809223 (1b61bbad32)
Reviewed By: cruvadom
Differential Revision: D33931570
fbshipit-source-id: 37962a437d8e25b1dafc58db0efa55c1afa5f3ee
Summary:
## PR review
1. Update HuBERT to work with the TransformerEncoder in wav2vec2.py
2. Remove the dictionary-loading issue when loading fine-tuned HuBERT checkpoints, to make the checkpoints self-contained
3. Add unit tests for fine-tuned HuBERT checkpoints
4. Avoid a divide-by-zero error in infer.py when inference time is zero (e.g., when inferring just one utterance; see the sketch after this list)
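A minimal sketch of the guard in item 4, with hypothetical names (not the exact infer.py code):
```python
def words_per_second(num_words: int, elapsed_seconds: float) -> float:
    # Clamp the denominator so a single very short utterance (elapsed time
    # rounded to zero) does not trigger a ZeroDivisionError.
    return num_words / max(elapsed_seconds, 1e-6)
```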
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3019
Reviewed By: andrewyeh
Differential Revision: D33970620
Pulled By: wnhsu
fbshipit-source-id: c523dd6ddb0f6a496be8b0b4b56f0c32c1d3dbc5
Summary:
This is the same as https://github.com/fairinternal/fairseq-py/issues/3003 but for main instead of gshard.
The lint test runs the latest version of black, which is 22.1.0 right now and seems to be incompatible with the 21.12b0 version set up in pre-commit. This means that some files that were validly formatted in the past no longer are.
This PR formats these files with 22.1.0 and auto-updates the pre-commit config to use that black version too.
(Note: this is the second time this has happened. A solution would be to pin the lint test to the same black version as the one in the pre-commit hook that was used to format everything, so that we have stable formatting.)
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3004
Reviewed By: dianaml0
Differential Revision: D33917490
Pulled By: Mortimerp9
fbshipit-source-id: d55e800b976f94545cdab4132daa7c45cbd0e34c
Summary:
## What does this PR do?
Default values for the configs imported from `user_dir` were not added properly.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3007
Reviewed By: alexeib
Differential Revision: D33926315
Pulled By: wnhsu
fbshipit-source-id: 914eecec769964686342d66c96d6ba76f12e1277
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4172
Reviewed By: punitkoura
Differential Revision: D33911169
Pulled By: todpole3
fbshipit-source-id: d3e111ab4b9a646e1799ad9335c70ec1ee8d25a4
Summary: EMA has been broken since D33649708 (995c204337) due to an indentation error.
Reviewed By: cruvadom
Differential Revision: D33809223
fbshipit-source-id: c6c4d0d327443bfea787817040e1832eef0f50e4
Summary:
## What does this PR do?
- Add unit test for HuBERT
- Update model args to comply with the wav2vec2 TransformerEncoder
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2766
Reviewed By: Abdel-rahmanMohamed
Differential Revision: D32965218
Pulled By: wnhsu
fbshipit-source-id: 036a1644179c35b875c9ba30d75b4ef039fb328f
Summary:
Preliminaries for the data2vec release, including some minor improvements and bug fixes.
The most important change is that we now default to raising an exception when fields in the config do not have a corresponding field in the model dataclass.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2929
Reviewed By: wnhsu
Differential Revision: D33649708
Pulled By: alexeib
fbshipit-source-id: 629bdb4c361550740b451c570c2005bb956c6fcb
Summary:
Add scripts for multihead attention selection in multilingual and multi-domain training from the following paper:
"Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling", NeurIPS 2021.
Reviewed By: yuntang
Differential Revision: D31802221
fbshipit-source-id: 8c69b89bda29e6857bd3af02979c07e1b5cf49f1
Summary: Add an option to use the EMA model for decoding in the transducer IPL recipe by passing --ipl-decode-ema. Note that EMA should be enabled, as in diff D24238379 (8feccf9441), using the options --store-ema, --ema-start-update and --ema-decay.
Reviewed By: cruvadom
Differential Revision: D31983366
fbshipit-source-id: 2bf63b3f7d1b5fa8804b3a7e9bfab71a463ca957
Summary:
Add scripts for multihead attention selection in multilingual and multi-domain training from the following paper:
"Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling", NeurIPS 2021.
Reviewed By: yuntang
Differential Revision: D31781212
fbshipit-source-id: 8e1a596826f682f80730c251ec31c68df0de6516
Summary:
Support FFN pruning for fairseq. For example, a user can apply pruning on top of a RoBERTa base model by specifying the argument "--ffn-blocks-to-remove 1024". The user also needs to provide a checkpoint that is already pruned so that the pruned checkpoint can be loaded correctly.
The pruning idea can be summarized as follows (see the sketch after this list):
1. Fine-tune the model (e.g. a RoBERTa encoder) on a given dataset with regularization.
2. After the model is trained, use the _get_fc_rank and _prune_fc_layer functions to get the top X most important blocks in each transformer layer. Then use that rank to prune a new RoBERTa encoder and save the pruned checkpoint manually.
3. Fine-tune the new RoBERTa encoder from the checkpoint saved above.
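A minimal sketch of the rank-then-prune idea for FFN blocks, using a simple weight-norm importance proxy; the helper names below are illustrative and are not the actual `_get_fc_rank`/`_prune_fc_layer` implementations:
```python
import torch
import torch.nn as nn

def rank_ffn_blocks(fc1: nn.Linear, fc2: nn.Linear) -> torch.Tensor:
    """Rank intermediate FFN units by an L1-norm importance proxy."""
    importance = (
        fc1.weight.abs().sum(dim=1) + fc2.weight.abs().sum(dim=0) + fc1.bias.abs()
    )
    return torch.argsort(importance, descending=True)

def prune_ffn_blocks(fc1: nn.Linear, fc2: nn.Linear, keep: torch.Tensor):
    """Keep only the selected intermediate units: shrink fc1 rows and fc2 columns."""
    new_fc1 = nn.Linear(fc1.in_features, len(keep))
    new_fc2 = nn.Linear(len(keep), fc2.out_features)
    with torch.no_grad():
        new_fc1.weight.copy_(fc1.weight[keep])
        new_fc1.bias.copy_(fc1.bias[keep])
        new_fc2.weight.copy_(fc2.weight[:, keep])
        new_fc2.bias.copy_(fc2.bias)
    return new_fc1, new_fc2
```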
Reviewed By: dianaml0
Differential Revision: D33525055
fbshipit-source-id: 5087140ee891d6ec9266726e3a477947c233412c
Summary:
# Before submitting
- [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?
## What does this PR do?
Update commands, checkpoints and contact info.
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4129
Reviewed By: dianaml0
Differential Revision: D33556233
Pulled By: shruti-bh
fbshipit-source-id: 3bad45b3e154fa11d4b13776d97408ce1a166113
Summary:
This is equivalent to PR https://github.com/fairinternal/fairseq-py/issues/2697 but on top of main instead of gshard (cherry-picked and merged as a squash):
* reorganize preprocess.py code a bit
* use Binarizers objects in the multiprocess code
* clean up the make_binary
* multiprocess logic
* learn to count
* format and doc string
* add basic test for vocab binarizer
* generalize to one line
* move multiprocess in binarizer
Testing:
```
python -m fairseq_cli.preprocess --only-source --trainpref ~/fixathon/small_vocab_test/train.in --destdir ~/fixathon/small_vocab_test/data-bin.cherry --workers 20
python -m fairseq_cli.preprocess --only-source --trainpref ~/fixathon/small_vocab_test/train.in --destdir ~/fixathon/small_vocab_test/data-bin.main --workers 20
```
```
md5sum ~/fixathon/small_vocab_test/data-bin.cherry/train.bin == md5sum ~/fixathon/small_vocab_test/data-bin.main/train.bin
```
```
diff ~/fixathon/small_vocab_test/data-bin.main/dict.txt ~/fixathon/small_vocab_test/data-bin.cherry/dict.txt
```
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2738
Reviewed By: sshleifer, dianaml0
Differential Revision: D32830875
Pulled By: Mortimerp9
fbshipit-source-id: e7463d5cdd96a877691bf39666daa319ebb3dcb8
Summary:
Support multihead attention pruning for fairseq. For example, a user can apply pruning on top of a RoBERTa base model by specifying the argument "--mha-heads-to-keep 8". The user also needs to provide a checkpoint that is already pruned so that the pruned checkpoint can be loaded correctly.
The pruning idea can be summarized as follows:
1. Fine-tune the model (e.g. a RoBERTa encoder) on a given dataset with regularization.
2. After the model is trained, use the get_reserve_head_index and _adaptive_prune_heads functions to get the top X most important heads. Then use that rank to prune a new RoBERTa encoder and save the pruned checkpoint manually.
3. Fine-tune the new RoBERTa encoder from the checkpoint saved above.
To avoid registering different pruned versions of RoBERTa, the --mha-heads-to-keep argument is used to prune the RoBERTa model into a version that matches the pruned checkpoint.
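A minimal sketch of head ranking with a simple weight-norm proxy; the function below is illustrative and is not the actual `get_reserve_head_index`/`_adaptive_prune_heads` logic:
```python
import torch

def rank_attention_heads(q_proj_weight: torch.Tensor, num_heads: int) -> torch.Tensor:
    """Rank heads by the L1 norm of each head's slice of the query projection."""
    head_dim = q_proj_weight.size(0) // num_heads
    per_head = q_proj_weight.abs().reshape(num_heads, head_dim, -1).sum(dim=(1, 2))
    return torch.argsort(per_head, descending=True)

# Usage (RoBERTa-base-like shapes): keep the top 8 of 12 heads, then rebuild the
# q/k/v/out projections with only those heads and save the pruned checkpoint.
keep = rank_attention_heads(torch.randn(768, 768), num_heads=12)[:8]
```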
Reviewed By: dianaml0
Differential Revision: D32449003
fbshipit-source-id: a952fd9ad723a6dbc5c2af574c42f2e9a1fa27dc
Summary:
- The goal of this framework is to support benchmarking various speech-to-speech translation (S2ST) models in terms of runtime, max memory consumption, and total number of floating-point operations (FLOPs).
- It is a generic framework and can easily be extended to support any fairseq model. To accurately benchmark performance, the core inference modules are re-implemented based on fairseq_cli/generate.py (core.py/Processing) and examples/speech_to_text/generate_waveform.py (core.py/SpeechGeneration).
- To ensure that end-to-end and cascaded models are compared fairly, for cascaded models we only consider the performance metrics for model inference at all stages, ignoring any intermediate data and I/O processing costs.
- We run all benchmarks on CPU, since CPU is generally used in production environments and there is a lack of good benchmarking-library support for GPUs (a minimal timing sketch follows below).
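For orientation, a minimal sketch of per-call CPU benchmarking covering runtime and memory only (tracemalloc tracks Python-level allocations, which is just a rough proxy for tensor memory); this is not the framework's actual API:
```python
import time
import tracemalloc

def benchmark_cpu(fn, *args, **kwargs):
    """Measure wall-clock runtime and peak Python-level memory of one inference call."""
    tracemalloc.start()
    start = time.perf_counter()
    out = fn(*args, **kwargs)
    runtime = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return out, runtime, peak_bytes
```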
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2852
Reviewed By: an918tw
Differential Revision: D33398060
Pulled By: sravyapopuri388
fbshipit-source-id: cffa19820deaa4ee7f629845944cbb6223498f4d
Summary:
**This PR**
- Adds a conformer layer based on https://arxiv.org/pdf/2005.08100.pdf (see the sketch after this list).
- The conformer implementation supports multihead attention with 3 different positional embedding types: absolute positional embedding, relative positional encoding, and rotary positional embedding.
- Adds a conformer encoder with conv1d subsampling and positional embedding, followed by N conformer layers.
- Adds an S2T_Conformer model based on the conformer encoder and a transformer decoder.
- Adds conformer support in wav2vec 2.0.
- Adds unit tests for core modules.
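A simplified, self-contained sketch of the macaron-style conformer block described above (positional encoding, dropout, and batch norm omitted); it is illustrative only, not the fairseq implementation:
```python
import torch
import torch.nn as nn

class ConformerBlockSketch(nn.Module):
    """Macaron block: half-step FFN -> self-attention -> conv module -> half-step FFN."""

    def __init__(self, dim: int, num_heads: int, kernel_size: int = 31):
        super().__init__()
        self.ffn1 = nn.Sequential(
            nn.LayerNorm(dim), nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim)
        )
        self.attn_norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.conv_norm = nn.LayerNorm(dim)
        self.pointwise1 = nn.Conv1d(dim, 2 * dim, kernel_size=1)
        self.depthwise = nn.Conv1d(dim, dim, kernel_size, padding=kernel_size // 2, groups=dim)
        self.pointwise2 = nn.Conv1d(dim, dim, kernel_size=1)
        self.ffn2 = nn.Sequential(
            nn.LayerNorm(dim), nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim)
        )
        self.final_norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, time, dim)
        x = x + 0.5 * self.ffn1(x)
        attn_in = self.attn_norm(x)
        x = x + self.attn(attn_in, attn_in, attn_in, need_weights=False)[0]
        c = self.conv_norm(x).transpose(1, 2)             # (batch, dim, time)
        c = nn.functional.glu(self.pointwise1(c), dim=1)  # gated pointwise conv
        c = self.pointwise2(nn.functional.silu(self.depthwise(c)))
        x = x + c.transpose(1, 2)
        x = x + 0.5 * self.ffn2(x)
        return self.final_norm(x)
```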
**Verification**
- Verified the setup on MUST-C En-De S2T, CoVoST 2 Es-En S2T, and Librispeech ASR to ensure the implementation is correct.
- For the S2T setups, the performance is either similar to the transformer-based models or better.
- wav2vec 2.0 pretraining and finetuning on Librispeech showed improvements over the corresponding transformer baselines.
- [WIP] Experiment log: https://docs.google.com/document/d/1QI-ROWVenUEXPJoHTaKD85Fq7T8ZXNc8bc54MzgwJjA/edit#
**Next steps**
- Add regression tests
- Add README and open source checkpoints
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2859
Reviewed By: kahne
Differential Revision: D33434092
Pulled By: sravyapopuri388
fbshipit-source-id: 62f22b917a332481370750e04a439e05832a2282
Summary: Add test for DualInputS2TTransformerModel at examples/speech_text_joint_to_text/models/s2t_dualinputtransformer.py
Reviewed By: kahne
Differential Revision: D33284188
fbshipit-source-id: c02b697fc7734425661e00bbb606852b5d94a587
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Applies `black` and `isort` to files
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2860
Reviewed By: Mortimerp9
Differential Revision: D33456637
Pulled By: dianaml0
fbshipit-source-id: 560b8d3a8f589cbecc92d0d21163596b5d47d609
Summary:
Update xm_transformer
- Added V1 arch (FFNs before/after convolutions in the adaptor, which didn't exist in the V0/ACL paper arch)
- Added args for gradient checkpointing and fully sharded data parallel
Reviewed By: sravyapopuri388
Differential Revision: D33144404
fbshipit-source-id: 548c917824ebd2aa926c83d5ba62fbf648cf4b97
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Applied `black` and `isort` to fix failing CI
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2834
Reviewed By: vedanuj
Differential Revision: D33262876
Pulled By: dianaml0
fbshipit-source-id: 03215c276fcddda9f7c78971bf6ed7c5ac21b2ee
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Releasing code, model & recipe for the work "Direct speech-to-speech translation with discrete units".
Main changes:
1. examples/speech_to_speech
2. tasks/speech_to_speech
3. data/audio/speech_to_speech_dataset
4. models/speech_to_speech
5. criterions/speech_to_speech_criterion
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2756
Reviewed By: sravyapopuri388, kahne
Differential Revision: D32923969
Pulled By: an918tw
fbshipit-source-id: 838ba42457f4684e9767d15b5b514681a9572b39
Summary:
Update ignore_prefix_size in label_smoothed_cross_entropy:
- lprobs is always B x T x C in the current models
- lprobs.batch_first defaulted to `False`, which contradicts the fact above (a slicing sketch follows below)
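A minimal sketch of the prefix stripping under the batch-first assumption above; the helper name is illustrative, not the exact criterion code:
```python
import torch

def strip_prefix(lprobs: torch.Tensor, target: torch.Tensor, ignore_prefix_size: int):
    """Drop the first `ignore_prefix_size` time steps, assuming batch-first (B, T, C) lprobs."""
    if ignore_prefix_size > 0:
        lprobs = lprobs[:, ignore_prefix_size:, :].contiguous()
        target = target[:, ignore_prefix_size:].contiguous()
    return lprobs, target
```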
Reviewed By: sravyapopuri388
Differential Revision: D33304121
fbshipit-source-id: 9391b48c7036642d9741d254b03c46389a4fe584
Summary:
# Before submitting
- [X] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [X] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/3882
Fixes https://github.com/pytorch/fairseq/issues/3884
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/3887
Reviewed By: yuntang
Differential Revision: D33152073
Pulled By: kahne
fbshipit-source-id: 7f5c90a9876320e7c5c406ed032681452c7c5056
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Add README and task for XGLM models.
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2808
Reviewed By: punitkoura
Differential Revision: D33237928
Pulled By: xianxl
fbshipit-source-id: 7773cf56e896210dab1f4311ae69f0e00c6d9aff
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
fix `black` failures
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2816
Reviewed By: alexeib
Differential Revision: D33172615
Pulled By: dianaml0
fbshipit-source-id: 36b141f42941670f1bfa981041d878042feb0428
Summary: When under TorchScript tracing (instead of only doing this for scripting), we set `export=True` for `LayerNorm`, since `FusedLayerNorm` doesn't work with JIT yet (see the `torch.jit.unused` decorator).
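A minimal sketch of the selection logic described above (a hypothetical factory, not the actual fairseq `LayerNorm` helper):
```python
import torch
import torch.nn as nn

def make_layer_norm(dim: int, export: bool = False) -> nn.Module:
    """Fall back to plain nn.LayerNorm under JIT scripting/tracing or when export is requested."""
    if export or torch.jit.is_scripting() or torch.jit.is_tracing():
        return nn.LayerNorm(dim)
    try:
        from apex.normalization import FusedLayerNorm  # optional, CUDA-only
        return FusedLayerNorm(dim)
    except ImportError:
        return nn.LayerNorm(dim)
```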
Reviewed By: cndn
Differential Revision: D33103054
fbshipit-source-id: f8c24a4a30a89dd4c70b19362fd60c51fcb9a1f0