Commit Graph

309 Commits

Author SHA1 Message Date
Piyush Kansal
7409af7f9a
Keep task level checkpoint key name generic (#5330) 2023-09-15 19:15:19 -04:00
Piyush Kansal
e29f53bfea
initial revision (#5328) 2023-09-15 15:01:49 -04:00
Egor Lakomkin
100cd91db1
Make RotaryPositionalEmbedding jit-compatible (#5237) 2023-07-07 08:08:01 +02:00
Franz Nowak
0338cdc309
fix imports referencing moved metrics.py file (#4840)
* fix imports referencing moved metrics.py file

* Make representation computation branchless in TransformerEncoderBase (#4818)

Summary:
We want to make the computation branchless here because fairseq code may be
exported and traced for deployment purposes, and tracing can break the
correctness of a captured program when its control flow depends on input data.
In this diff we rewrite the code to remove one branch so that the tracer can
proceed and the correct semantics of the model are preserved.
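
The kind of rewrite described here can be sketched in plain Python (a simplified stand-in for the real tensor code; the function names are illustrative, not fairseq's):

```python
def scale_branchy(values, pad_mask, has_pads):
    # Data-dependent branch: a tracer records only the path taken for the
    # example input, so the captured program is wrong for other inputs.
    if has_pads:
        return [v * (1 - m) for v, m in zip(values, pad_mask)]
    return list(values)

def scale_branchless(values, pad_mask):
    # Always apply the mask: when there are no pads the mask is all zeros
    # and the multiplication is a no-op, so no branch is needed.
    return [v * (1 - m) for v, m in zip(values, pad_mask)]
```

Both forms compute the same result, but only the second one traces safely because it executes the same operations for every input.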

Test Plan:
CI

Reviewers:

Subscribers:

Tasks:

Tags:

* Fix Torchscript typing in transformer_encoder.py (#4847)

* Add Generative Spoken Dialogue Language Modeling (#4879)

* Update deprecated torch.qr in glow.py example (#4685)

torch.qr has been deprecated for a long time and is being removed by https://github.com/pytorch/pytorch/pull/70989.

This PR makes the example compatible with new and old PyTorch versions.

* Emotion Conversion Paper Open Source (#4895)

* data2vec v2.0 (#4903)

data2vec 2.0
Co-authored-by: Arun Babu <arbabu@fb.com>
Co-authored-by: Wei-Ning Hsu <wnhsu@csail.mit.edu>

* remove missing config entries when loading task from checkpoint (#4905)

* make apex optional (#4906)

* Add file to generate manifests for stop dataset. (#4891)

* Update STOP dataset README to include proper link. (#4892)

* Update README.md (#4893)

* using foreach to reduce kernel (#4904)

* using foreach to reduce kernel

* set reproducibility to looser threshold

* revert optimizer

* update

* update

* update

* update

* update

* update

* update

Co-authored-by: juntengjia <juntengjia@fb.com>

* Update README.md to add data2vec blog post (#4913)

* Update README.md

* Update config to fix circleci failure (#4949)

https://app.circleci.com/pipelines/github/fairinternal/fairseq-py/12635/workflows/3befbae2-79c4-458d-9fc4-aad4484183b4/jobs/26767

* Generative Spoken Dialogue Language Modeling Paper Open Source (#4957)

* wav2vec2_laser (#4968)

* ASR BLEU tool copied from ust branch into main (#4914)

* Add transcript option for asr-bleu (#4981)

---------

Co-authored-by: zhxchen17 <zhxchen17@outlook.com>
Co-authored-by: zhxchen17 <zhxchen17@fb.com>
Co-authored-by: Nguyen Tu Anh <nguyentuanh208@gmail.com>
Co-authored-by: Sergii Dymchenko <kit1980@gmail.com>
Co-authored-by: Felix Kreuk <felixkreuk@gmail.com>
Co-authored-by: Alexei Baevski <alexei.b@gmail.com>
Co-authored-by: padentomasello <pdtomasello@gmail.com>
Co-authored-by: Junteng Jia <juntengjia@hotmail.com>
Co-authored-by: juntengjia <juntengjia@fb.com>
Co-authored-by: arbabu123 <arbabu@fb.com>
Co-authored-by: dianaml0 <82468439+dianaml0@users.noreply.github.com>
Co-authored-by: Pierre Andrews <mortimer@fb.com>
Co-authored-by: Ilia Kulikov <kulikov@cs.nyu.edu>
Co-authored-by: Xutai Ma <xutaima@gmail.com>
2023-02-23 16:18:36 -05:00
Alexei Baevski
d871f6169f
data2vec v2.0 (#4903)
data2vec 2.0
Co-authored-by: Arun Babu <arbabu@fb.com>
Co-authored-by: Wei-Ning Hsu <wnhsu@csail.mit.edu>
2022-12-12 08:53:56 -08:00
Wei
acd9a53607
update isort (#4568)
Co-authored-by: dianaml0 <82468439+dianaml0@users.noreply.github.com>
2022-08-01 14:26:36 -07:00
Alexander Jipa
ba415c99ca
add span_masked_lm task (#4366)
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
2022-06-29 10:04:00 -04:00
Alexander Jipa
a6a6327942
switch denoising and multilingual_denoising tasks to OmegaConf (#4447)
Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
2022-06-28 15:44:18 -04:00
Wei Wei
d364fdbb26 Reland BT enablement on fairseq - fairseq change (#4513)
Summary:
Pull Request resolved: https://github.com/facebookresearch/fairseq/pull/4513
With some fixes to torchscript using dual copies.
Reland this diff.

Reviewed By: erichan1

Differential Revision: D37371293

fbshipit-source-id: 4fcfc4083955b6f5fc4ef8600f1b517b6ba69aae
2022-06-24 19:03:29 -07:00
Ilia Kulikov
5528b6a382 add reading from zip audio to hubert dataset and scripts (#3403)
Summary:
These are changes from:

https://github.com/fairinternal/fairseq-py/pull/3310
https://github.com/fairinternal/fairseq-py/pull/3285

which were on the ust team branch and are now moving to main.

The main goal is to provide a hubert dataset and scripts to read audio from zipped audio storage, with backward compatibility depending on the given path.

X-link: https://github.com/fairinternal/fairseq-py/pull/3403

Reviewed By: kahne

Differential Revision: D37150156

Pulled By: uralik

fbshipit-source-id: 7f249b09d7e971c6c7f99114709c26e6a35805cf
2022-06-24 14:09:30 -07:00
Wei Ho
956fcf495b Back out "BT enablement on fairseq - fairseq change"
Summary:
Context: https://fburl.com/7vdj7vhl

Backing out due to breaking our TorchScript test:
```
RuntimeError:
method cannot be used as a value:
  File "/dev/shm/uid-30041/54641b26-seed-nspid4026533396_cgpid7154327-ns-4026533393/fairseq/modules/transformer_layer.py", line 307
                self.in_proj_weight,
                self.in_proj_bias,
                self.self_attn.out_proj.weight,
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
                self.self_attn.out_proj.bias,
                self.activation_relu_or_gelu == 2,

Stack trace:
Exception type: torch::jit::ErrorReport
```
https://fburl.com/sandcastle/4pzqemf5

Original commit changeset: 984266f850fc

Original Phabricator Diff: D37082681 (3a757d7ab2)

Differential Revision: D37303846

fbshipit-source-id: 1757ea5dae98be5beb4d08f70b0c3001d6ea336f
2022-06-21 17:27:50 -07:00
Wei Wei
3a757d7ab2 BT enablement on fairseq - fairseq change (#4480)
Summary:
Pull Request resolved: https://github.com/facebookresearch/fairseq/pull/4480

as titled and depends on D36057338
Fork the inference path inside the forward function: if a checkpoint has been loaded and we are running inference, we dispatch to BT; otherwise the stock fairseq path takes over.

In summary:
Accuracy: there is some accuracy loss due to fp16; the maximum diff is around 0.009. If we set it to fp32, there is no accuracy loss.
Perf: the current fairseq path has a similar speed to the vanilla version. After the enablement, the speedup is similar to the standalone BT test.
With batch size=64:
For V100, the speedup reaches 1.23x
For A100, the speedup reaches 1.38x

After enabling nested tensors:
For V100, the speedup reaches 2.46x

Reviewed By: mikekgfb

Differential Revision: D37082681

fbshipit-source-id: 984266f850fc30603e48be56e41ac2c67da080f5
2022-06-15 21:48:41 -07:00
Jongsoo Park
e0884db9a7 don't use half precision in test_ema on CPU (#3408)
Summary:
X-link: https://github.com/fairinternal/fairseq-py/pull/3408

Pull Request resolved: https://github.com/facebookresearch/fairseq/pull/4443

To fix errors introduced in D35571505

Reviewed By: ngimel

Differential Revision: D36726254

fbshipit-source-id: dde8964c47426839b03c842574669ae9428031c6
2022-05-26 21:14:17 -07:00
Yun Tang
993129dae4 Merge STPT: Step 3
Summary:
1. Add joint pre-training scripts
2. Replace prepend_tgt_lang_tag_no_change with prepend_tgt_lang_tag_as_bos
3. Add readme for the joint pre-training
4. Add test case for the Librispeech model

Reviewed By: hygong-fb

Differential Revision: D36300953

fbshipit-source-id: cb749689787ed97c1250d122bdefb7f7a2252292
2022-05-10 19:44:00 -07:00
dianaml0
e71c4d04d7 fix broken build and docs (#3362)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
- [x] formatting fix
- [x] optional import of xFormers
- [x] enabled doc building as part of CI
- [x] remove mask arguments for attentions that do not support them
- [x] remove masks for blocksparse tests, no longer supported
- [ ] use pytest instead of deprecated `setup.py test`
- [ ] CircleCI xFormers tests

Will submit without the last two done to unblock people using the repo

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

X-link: https://github.com/fairinternal/fairseq-py/pull/3362

Reviewed By: blefaudeux

Differential Revision: D36169572

Pulled By: dianaml0

fbshipit-source-id: 3b20ae5f377144a0854e016771af703f0d0d694b
2022-05-05 15:18:53 -07:00
dianaml0
51478ad3a1 xformer integration (#2263)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?

## What does this PR do?
This PR is a cleaned up version of https://github.com/fairinternal/fairseq-py/issues/2138. It is based on the `main` branch instead of the `gshard` branch. Removed call to xFormers MultiHeadDispatch, only using xFormers Attention.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

X-link: https://github.com/fairinternal/fairseq-py/pull/2263

Reviewed By: blefaudeux

Differential Revision: D33800377

Pulled By: dianaml0

fbshipit-source-id: 658d52214c782212b12881b30c4d908a763b4cf2
2022-05-04 09:15:36 -07:00
Diana Liskovich
0b54d9fb2e fix formatting (#3350)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

X-link: https://github.com/fairinternal/fairseq-py/pull/3350

Reviewed By: shruti-bh

Differential Revision: D36009526

Pulled By: dianaml0

fbshipit-source-id: 9cdc3d53086b8d40a780bcb64cfe28108091ab98
2022-04-28 14:17:09 -07:00
Diana Liskovich
72d3408481 Pull out some code into separate methods (#3068)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Pulling out some changes from https://github.com/fairinternal/fairseq-py/pull/2263 unrelated to xformers to make the PR cleaner

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

X-link: https://github.com/fairinternal/fairseq-py/pull/3068

Reviewed By: blefaudeux

Differential Revision: D34149016

Pulled By: dianaml0

fbshipit-source-id: 6442a5f451d56cc47106227298a624516b19a9ad
2022-04-27 16:54:02 -07:00
Alexander Jipa
355ffbe4e2 add masked_lm test (#4344)
Summary:
# Before submitting

- [X] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [X] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [X] Did you make sure to update the docs?
- [X] Did you write any new necessary tests?

## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/4300

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Big time!

Note:
I had to update `black` because of [this known issue](https://github.com/psf/black/issues/2964):
```
black....................................................................Failed
- hook id: black
- exit code: 1
Traceback (most recent call last):
  File "/Users/azzhipa/.cache/pre-commit/repoxt83whf2/py_env-python3.8/bin/black", line 8, in <module>
    sys.exit(patched_main())
  File "/Users/azzhipa/.cache/pre-commit/repoxt83whf2/py_env-python3.8/lib/python3.8/site-packages/black/__init__.py", line 1423, in patched_main
    patch_click()
  File "/Users/azzhipa/.cache/pre-commit/repoxt83whf2/py_env-python3.8/lib/python3.8/site-packages/black/__init__.py", line 1409, in patch_click
    from click import _unicodefun
ImportError: cannot import name '_unicodefun' from 'click' (/Users/azzhipa/.cache/pre-commit/repoxt83whf2/py_env-python3.8/lib/python3.8/site-packages/click/__init__.py)
```

Pull Request resolved: https://github.com/pytorch/fairseq/pull/4344

Reviewed By: zhengwy888

Differential Revision: D35691648

Pulled By: dianaml0

fbshipit-source-id: 4bdf408bc9d9cca76c9c08e138cf85b1d00d14d4
2022-04-18 14:47:00 -07:00
spopuri
420136acd2 fix failing convtransformer test (#3107)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3107

Reviewed By: cndn

Differential Revision: D34354339

Pulled By: sravyapopuri388

fbshipit-source-id: 50888706123d246c13d2cbb22d0e043740ff6bf5
2022-02-22 11:24:11 -08:00
Sravya Popuri
67eaecd2fc Add regression test for SimulConvTransformerModel (#3031)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3031

Reviewed By: kahne

Differential Revision: D34018108

Pulled By: sravyapopuri388

fbshipit-source-id: 4db96653658a998b15c0cdbc2e588198d951a420
2022-02-16 09:32:21 -08:00
dianaml0
5f2515e676 Fix failing test (#3065)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3065

Reviewed By: Mortimerp9

Differential Revision: D34144674

Pulled By: dianaml0

fbshipit-source-id: 842b0d29c9c85d4b56b640f2823fcb4e3f912f98
2022-02-10 12:17:47 -08:00
Sravya Popuri
8b02f00e8a fix s2s test - disable multitasking by setting multitask_config_yaml to None (#3059)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3059

Reviewed By: kahne

Differential Revision: D34083178

Pulled By: sravyapopuri388

fbshipit-source-id: a33af1696570be4826973b19fe34177bcf851e06
2022-02-09 10:05:22 -08:00
Sravya Popuri
11b2830d29 Refactor speech tests and add missing regression tests (#3001)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3001

Reviewed By: kahne

Differential Revision: D33904550

Pulled By: sravyapopuri388

fbshipit-source-id: f55f8121d83e5abebdfcf7ac90dcba39f65cafaf
2022-02-04 14:35:02 -08:00
Vimal Manohar
6b7a7d6457 Fix EMA GPU test
Summary: The GPU test was broken after D33809223 (1b61bbad32)

Reviewed By: cruvadom

Differential Revision: D33931570

fbshipit-source-id: 37962a437d8e25b1dafc58db0efa55c1afa5f3ee
2022-02-04 09:10:06 -08:00
Pierre Andrews
f591cc94ca upgrade black for lints (#3004)
Summary:
This is the same as https://github.com/fairinternal/fairseq-py/issues/3003 but for main instead of gshard.

The lint test runs the latest version of black, which is 22.1.0 right now and seems to be incompatible with the 21.12b0 version set up in pre-commit. This means that some files that were validly formatted in the past no longer are...

This PR formats these files with 22.1.0 and autoupdates pre-commit config to use that black version too.

(Note: this is the second time this has happened. A solution would be to pin the lint test to the same black version as the pre-commit hook, i.e. the one used to format everything clean, so that formatting stays stable.)

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3004

Reviewed By: dianaml0

Differential Revision: D33917490

Pulled By: Mortimerp9

fbshipit-source-id: d55e800b976f94545cdab4132daa7c45cbd0e34c
2022-02-02 04:31:33 -08:00
Vimal Manohar
1b61bbad32 Fix broken EMA in fairseq
Summary: EMA has been broken since D33649708 (995c204337) due to an indentation error.

Reviewed By: cruvadom

Differential Revision: D33809223

fbshipit-source-id: c6c4d0d327443bfea787817040e1832eef0f50e4
2022-01-27 13:02:58 -08:00
alexeib
995c204337 Data2vec prelim (#2929)
Summary:
Preliminaries for the data2vec release, including some minor improvements and bug fixes.

The most important change is that we now default to raising an exception when fields in the config do not have a corresponding field in the model dataclass.
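
The strict-by-default behavior can be illustrated with a small sketch (names here are assumed for illustration, not fairseq's actual helpers):

```python
from dataclasses import dataclass, fields

def strict_merge(cfg: dict, model_dataclass):
    # Reject config fields that have no counterpart on the model dataclass
    # instead of silently ignoring them.
    known = {f.name for f in fields(model_dataclass)}
    unknown = sorted(set(cfg) - known)
    if unknown:
        raise ValueError(f"unrecognized config fields: {unknown}")
    return model_dataclass(**cfg)

@dataclass
class ModelCfg:
    dim: int = 768
    layers: int = 12
```

Raising early turns a silent checkpoint/config mismatch into an immediate, debuggable error.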

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2929

Reviewed By: wnhsu

Differential Revision: D33649708

Pulled By: alexeib

fbshipit-source-id: 629bdb4c361550740b451c570c2005bb956c6fcb
2022-01-20 00:02:16 -08:00
Liang Tan
1575f30dd0 Add ffn prune to fairseq
Summary:
Support FFN pruning for fairseq. For example, a user can apply pruning on top of a Roberta base model by specifying the argument "--ffn-blocks-to-remove 1024". The user also needs to provide a ckpt which is already pruned so that it can be loaded correctly.
The pruning workflow can be summarized as:
1. Fine-tune the model (e.g. a Roberta encoder) on a certain dataset with regularization.
2. After the model is trained, use the _get_fc_rank and _prune_fc_layer functions to get the top X most important blocks in each transformer layer, then use that rank to prune a new Roberta encoder and save the pruned ckpt manually.
3. Fine-tune the new Roberta encoder via the ckpt saved above.
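
The ranking step described above amounts to ranking FFN blocks by an importance score and keeping the most important ones. A minimal stand-in for _get_fc_rank (the real helper operates on layer weights; this toy version just takes precomputed scores) might look like:

```python
def get_fc_rank(importance, blocks_to_remove):
    # Rank FFN blocks by importance (e.g. an accumulated regularization
    # statistic) and return the indices of the blocks to keep, in order.
    ranked = sorted(range(len(importance)),
                    key=lambda i: importance[i], reverse=True)
    n_keep = len(importance) - blocks_to_remove
    return sorted(ranked[:n_keep])
```

The returned indices would then be used to slice the fc1/fc2 weight matrices when building the pruned encoder.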

Reviewed By: dianaml0

Differential Revision: D33525055

fbshipit-source-id: 5087140ee891d6ec9266726e3a477947c233412c
2022-01-14 16:26:59 -08:00
Vimal Manohar
cf8ff8c3c5 Add unittests for jitting EMA model
Summary: As title

Reviewed By: nayansinghal

Differential Revision: D32005717

fbshipit-source-id: ebdf1ed0e4a2b9fccffd841d0fa7be0b50ec6b79
2022-01-13 01:53:42 -08:00
Pierre Andrews
279796224f Preprocess Split (#2738)
Summary:
This is the equivalent to PR https://github.com/fairinternal/fairseq-py/issues/2697 but on top of main instead of gshard (cherry-picked and merged the squash):

* reorganize preprocess.py code a bit
* use Binarizers objects in the multiprocess code
* clean up the make_binary
* multiprocess logic
* learn to count
* format and doc string
* add basic test for vocab binarizer
* generalize to one line
* move multiprocess in binarizer

Testing:
```
python -m fairseq_cli.preprocess --only-source --trainpref ~/fixathon/small_vocab_test/train.in --destdir ~/fixathon/small_vocab_test/data-bin.cherry --workers 20
python -m fairseq_cli.preprocess --only-source --trainpref ~/fixathon/small_vocab_test/train.in --destdir ~/fixathon/small_vocab_test/data-bin.main --workers 20
```

```
 md5sum ~/fixathon/small_vocab_test/data-bin.cherry/train.bin == md5sum ~/fixathon/small_vocab_test/data-bin.main/train.bin
```

```
diff ~/fixathon/small_vocab_test/data-bin.main/dict.txt ~/fixathon/small_vocab_test/data-bin.cherry/dict.txt
```

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2738

Reviewed By: sshleifer, dianaml0

Differential Revision: D32830875

Pulled By: Mortimerp9

fbshipit-source-id: e7463d5cdd96a877691bf39666daa319ebb3dcb8
2022-01-11 11:56:46 -08:00
Liang Tan
b3fa5100c6 Add mha prune to fairseq
Summary:
Support multihead attention pruning for fairseq. For example, a user can apply pruning on top of a Roberta base model by specifying the argument "--mha-heads-to-keep 8". The user also needs to provide a ckpt which is already pruned so that it can be loaded correctly.

The pruning workflow can be summarized as:
1. Fine-tune the model (e.g. a Roberta encoder) on a certain dataset with regularization.
2. After the model is trained, use the get_reserve_head_index and _adaptive_prune_heads functions to get the top X most important heads, then use that rank to prune a new Roberta encoder and save the pruned ckpt manually.
3. Fine-tune the new Roberta encoder via the ckpt saved above.

To avoid registering a different pruned version of Roberta for each configuration, the --mha-heads-to-keep argument prunes the Roberta model into a version that matches the pruned ckpt.
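
The head-selection step reduces to a top-k over per-head importance scores. A toy stand-in (the real get_reserve_head_index inspects attention weights; here the scores and the default head_dim are assumed for illustration):

```python
def prune_heads(head_importance, heads_to_keep, head_dim=64):
    # Pick the top-k heads by importance and report the reduced q/k/v
    # projection width for the pruned encoder.
    keep = sorted(sorted(range(len(head_importance)),
                         key=lambda i: head_importance[i],
                         reverse=True)[:heads_to_keep])
    return keep, head_dim * heads_to_keep
```

The kept indices determine which head slices of the q/k/v and output projections survive in the pruned checkpoint.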

Reviewed By: dianaml0

Differential Revision: D32449003

fbshipit-source-id: a952fd9ad723a6dbc5c2af574c42f2e9a1fa27dc
2022-01-11 10:09:07 -08:00
Sravya Popuri
40ff55abbe conformer (#2859)
Summary:
**This PR**

- Adds conformer layer based on https://arxiv.org/pdf/2005.08100.pdf.
- Conformer implementation supports multihead attention based on 3 different positional embedding types - absolute positional embedding, relative positional encoding and rotary positional embedding.
- Adds conformer encoder with conv1d subsampling, positional embedding followed by N conformer layers
- Adds S2T_Conformer model based on the conformer encoder and transformer decoder.
- Add conformer support in Wav2Vec2
- Add unit tests for core modules
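
The conformer layer from the paper composes its sub-modules in a macaron pattern, with two half-step feed-forward modules sandwiching attention and convolution. Schematically (sub-modules passed as plain callables here; this is a sketch of the layer ordering, not the fairseq module):

```python
def conformer_block(x, ffn1, mhsa, conv, ffn2, norm):
    # Macaron-style ordering from Gulati et al. 2020: half-step FFN,
    # self-attention, convolution, half-step FFN, final layer norm,
    # each sub-module wrapped in a residual connection.
    x = x + 0.5 * ffn1(x)
    x = x + mhsa(x)
    x = x + conv(x)
    x = x + 0.5 * ffn2(x)
    return norm(x)
```

The half-weighted FFN residuals are what distinguish the conformer block from a plain transformer layer with an extra convolution.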

**Verification**

- Verified the set up on MUST-C En-De S2T, Covost2 Es-En S2T, Librispeech ASR to ensure the implementation is correct.
- For S2T setups, the performance is either similar to the transformer based models or better.
- Wav2vec2 pretraining and finetuning based on librispeech showed improvements over corresponding transformer baselines.
- [WIP] Experiment log: https://docs.google.com/document/d/1QI-ROWVenUEXPJoHTaKD85Fq7T8ZXNc8bc54MzgwJjA/edit#

**Next steps**
- Add regression tests
- Add README and open source checkpoints

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2859

Reviewed By: kahne

Differential Revision: D33434092

Pulled By: sravyapopuri388

fbshipit-source-id: 62f22b917a332481370750e04a439e05832a2282
2022-01-10 16:18:38 -08:00
Yun Tang
e69f1fa37f speech integration tests for jointly trained models
Summary: Add test for DualInputS2TTransformerModel at examples/speech_text_joint_to_text/models/s2t_dualinputtransformer.py

Reviewed By: kahne

Differential Revision: D33284188

fbshipit-source-id: c02b697fc7734425661e00bbb606852b5d94a587
2022-01-07 12:45:20 -08:00
Changhan Wang
ee177fc4fa add xm_transformer test; refactor speech tests
Summary: add xm_transformer test; refactor speech tests

Reviewed By: sravyapopuri388

Differential Revision: D33312231

fbshipit-source-id: a2b2695fc3c10d5420abbe23a4a3005777aa2ae1
2021-12-31 12:31:11 -08:00
Liang Tan
2762a1cfef Add regularization for multihead attention module and ffn module
Summary: [Fairseq] Add regularization for multihead attention module and ffn module

Reviewed By: dianaml0

Differential Revision: D32441521

fbshipit-source-id: c648c1f8ec1a3310ba90c4952cdd40a21b959d26
2021-12-30 02:02:05 -08:00
Diana Liskovich
7fddb9d960 lint fixes (#2834)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Applied `black` and `isort` to fix failing CI

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2834

Reviewed By: vedanuj

Differential Revision: D33262876

Pulled By: dianaml0

fbshipit-source-id: 03215c276fcddda9f7c78971bf6ed7c5ac21b2ee
2021-12-29 11:50:55 -08:00
Xian Li
7f3967805f add readme for xglm models (#2808)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Add readme and task for xglm models.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2808

Reviewed By: punitkoura

Differential Revision: D33237928

Pulled By: xianxl

fbshipit-source-id: 7773cf56e896210dab1f4311ae69f0e00c6d9aff
2021-12-20 13:05:17 -08:00
Diana Liskovich
a54021305d formatting fix (#2816)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
fix `black` failures

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2816

Reviewed By: alexeib

Differential Revision: D33172615

Pulled By: dianaml0

fbshipit-source-id: 36b141f42941670f1bfa981041d878042feb0428
2021-12-16 16:11:19 -08:00
Changhan Wang
7b0159a202 add integration test for fastspeech2
Summary: Adding integration test (based on test set scores on pre-trained checkpoints) for fastspeech2

Reviewed By: yuntang

Differential Revision: D33143301

fbshipit-source-id: dca0841b43dd1cb2933ce5c652ed3cdff0fc4a52
2021-12-15 16:15:38 -08:00
Changhan Wang
ee833ed49d speech integration tests (batch 1)
Summary:
Adding the first batch of speech integration tests (based on test set scores on pre-trained checkpoints) for
- S2T transformer
- TTS transformer

Reviewed By: yuntang

Differential Revision: D33050653

fbshipit-source-id: fb5bb9f46e8e17cb705971ca1990c8e1cb99d5f9
2021-12-14 17:42:18 -08:00
Jingfei Du
16ebfa752c Revert prefix beamsearch fix (#2763)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Reverting to fix the issue mentioned [here](https://github.com/pytorch/fairseq/issues/3913). A follow-up PR will fix the original issue later.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2763

Reviewed By: myleott

Differential Revision: D33000411

Pulled By: jingfeidu

fbshipit-source-id: 95a54cbdc612129a0eab4b5e6aa576a5bcf00588
2021-12-14 13:22:09 -08:00
Changhan Wang
8548f1d401 Add loading from HuggingFace Hub
Summary: Add loading from HuggingFace Hub. Revised from D32697723 (accepted), which this replaces.

Reviewed By: pipibjc, dianaml0

Differential Revision: D32964041

fbshipit-source-id: 39676aa0ecb10454ae76b70968d5abe96ab6da54
2021-12-10 16:55:12 -08:00
dianaml0
88e7d2586b fix flake8 issues (#2570)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
- [x] applies flake8 fixes to main branch (https://github.com/fairinternal/fairseq-py/issues/2546) - still more to be fixed

Fix GPU tests:
- [x] when torch.ao.quantization import doesn't work use torch.quantization
- [x] build apex from earlier commit in circleci so that its compatible with pytorch 1.8 and 1.9

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2570

Reviewed By: Mortimerp9

Differential Revision: D32955312

Pulled By: dianaml0

fbshipit-source-id: e163cbd4998f171f819e31b0682c1c0f1986f9e1
2021-12-09 02:34:30 -08:00
dianaml0
0dfd6b6240 Add linting with black (#2678)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2678

Reviewed By: Mortimerp9

Differential Revision: D32653381

Pulled By: dianaml0

fbshipit-source-id: 2810d14867cd7d64f4d340740e2b590b82de47fe
2021-11-29 12:32:59 -08:00
Sam Shleifer
fb64e43c67 skip remainder batch (#2464)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2464

Reviewed By: myleott

Differential Revision: D31742871

Pulled By: sshleifer

fbshipit-source-id: e5d29ca9d594abd92212eb24b60c991f2840a4e8
2021-11-24 07:50:50 -08:00
Vinayak Tantia
3a5838c320 Update implementation of SlowMo to its implementation in Fairscale (#3996)
Summary:
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?

## What does this PR do?
SlowMo is being moved to [Fairscale](https://fairscale.readthedocs.io/en/latest/). This commit updates the implementation of SlowMo to the Fairscale version. It also adds tests for SlowMo.
Note: This PR is currently for review. It will be merged at a later date, once SlowMo has been updated to Fairscale. SlowMo is being merged to Fairscale as part of [a PR](https://github.com/facebookresearch/fairscale/pull/378). Once that PR is merged into Fairscale, this PR on Fairseq will be ready to merge.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding!

Pull Request resolved: https://github.com/pytorch/fairseq/pull/3996

Reviewed By: dianaml0

Differential Revision: D32280163

Pulled By: vtantia

fbshipit-source-id: 70c97b04a7cdc90ada7099375c2a31b0c978ba70
2021-11-09 09:44:45 -08:00
Sam Shleifer
c5ff181125 NormFormer: flags and docs (#2460)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2460

Reviewed By: myleott

Differential Revision: D31731798

Pulled By: sshleifer

fbshipit-source-id: 938456c17aa004cacffdcdd124aebe390da83d5f
2021-10-19 17:13:04 -07:00
Vimal Manohar
1ef3d6a1a2 CPLTask for training with continuous pseudo labeling
Summary:
CPLTaskImpl provides an implementation that augments existing tasks to take an additional ema_model input in their train_step and valid_step for continuous pseudo-labeling (CPL) during training. It passes this ema_model to the criterion.

See Kaizen semi-supervised training paper for more details https://arxiv.org/abs/2106.07759.

This implementation also supports using CPLDataset, which enables using unsupervised data only for `cpl_finetune_epoch > epochs >= cpl_start_epoch`. CPLDataset is like MultiCorpusDataset but ignores the unsupervised datasets while sampling.

Another addition in this diff is to skip a dataset in MultiCorpusDataset if its sampling probability is 0.
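The train_step pattern described above can be sketched as follows. This is a hypothetical illustration of the CPL wrapper, not the actual fairseq API: the names CPLTaskImpl and ema_model follow the commit message, but the signatures are assumptions.

```python
# Hypothetical sketch: a task wrapper whose train_step receives an EMA
# (teacher) model and forwards it to the criterion, which can use it to
# generate pseudo-labels for unsupervised samples.
class CPLTaskImpl:
    def __init__(self, task, cpl_start_epoch, cpl_finetune_epoch):
        self.task = task
        # Unsupervised data is used only for epochs in
        # [cpl_start_epoch, cpl_finetune_epoch).
        self.cpl_start_epoch = cpl_start_epoch
        self.cpl_finetune_epoch = cpl_finetune_epoch

    def train_step(self, sample, model, criterion, optimizer, ema_model=None):
        # The criterion receives both the trained model and the EMA model;
        # the EMA model is only read from, never updated here.
        loss, sample_size, logging_output = criterion(
            model, sample, ema_model=ema_model
        )
        optimizer.backward(loss)
        return loss, sample_size, logging_output
```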

Reviewed By: cruvadom

Differential Revision: D30701536

fbshipit-source-id: 1d840eacfd538ed7aed3baaefc8b254390642b45
2021-10-14 22:09:07 -07:00
Vimal Manohar
8feccf9441 EMA
Summary:
Adds an exponential moving average (EMA) model for Kaizen semi-supervised training https://arxiv.org/abs/2106.07759

1. Add `ema.store_ema` to enable storing EMA. The EMA is written to extra_state in the state dict when saving a checkpoint.
2. `ema.ema_start_update` controls when the EMA starts accumulating.
3. Tasks can use the `uses_ema` property to decide whether the EMA should be passed to the task (default is False).
4. `load_ema_from_checkpoint` can be used to load the EMA model in place of the model for evaluation. Pyspeech has an eval-ema option for this.

```
This module has the EMA class used to store a copy of the exponentially decayed
model params.

Typical usage of EMA class involves initializing an object using an existing
model (random or from a seed model) and setting the config like ema_decay,
ema_start_update which determine how the EMA model is updated. After every
update of the model i.e. at the end of the train_step, the EMA should be updated
by passing the new model to the EMA.step function. The EMA model state dict
can be stored in the extra state under the key of "ema" and dumped
into a checkpoint and loaded. The EMA object can be passed to tasks
by setting task.uses_ema property.
EMA is a smoothed/ensemble model which might have better performance
when used for inference or further fine-tuning. EMA class has a
reverse function to load the EMA params into a model and use it
like a regular model.
```
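The update rule described in the docstring can be sketched in a few lines. This is an illustrative re-implementation, not the fairseq EMA module itself; the class name and the decay/start_update fields mirror the ema_decay and ema_start_update config options named above.

```python
# Minimal EMA sketch: keep a decayed copy of the model params, update it
# after each train step, and copy it back for inference ("reverse").
import copy

import torch


class EMA:
    def __init__(self, model: torch.nn.Module, decay: float = 0.999, start_update: int = 0):
        self.model = copy.deepcopy(model)  # smoothed copy of the params
        self.model.eval()
        self.decay = decay
        self.start_update = start_update

    @torch.no_grad()
    def step(self, new_model: torch.nn.Module, num_updates: int):
        # Before start_update, track the raw params directly (decay = 0).
        decay = self.decay if num_updates >= self.start_update else 0.0
        for ema_p, p in zip(self.model.parameters(), new_model.parameters()):
            # ema = decay * ema + (1 - decay) * param
            ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

    def reverse(self, model: torch.nn.Module) -> torch.nn.Module:
        # Load the smoothed params into a regular model for inference
        # or further fine-tuning.
        model.load_state_dict(self.model.state_dict())
        return model
```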

Reviewed By: cruvadom

Differential Revision: D24238379

fbshipit-source-id: 879d3ba5070a614b7d365f9503af357001e875b2
2021-09-01 12:29:51 -07:00