2326 Commits

Author SHA1 Message Date
Brian Yan
018621f3cc
update paper link + bug fix (#5547) 2024-10-03 08:50:44 -07:00
Brian Yan
c2145111e7
Add LID rerank for MMS (#5545)
* init lid rerank

* init lid rerank

* add greedy ctc score
2024-09-26 19:13:41 -07:00
Vineel Pratap
920a548ca7
Create README.md (#5529)
MMS Zero-shot release
2024-07-22 10:17:18 +02:00
Jon Janzen
d9a627082f
Create depreview.yml (#5501) 2024-05-30 14:44:49 -07:00
Jon Janzen
bedb259bf3
Delete .circleci directory (#5458) 2024-03-13 06:24:17 -07:00
Jiatong
34973a94d0
Multires hubert (#5363)
* multires hubert core

* update core codebase on multiresolution hubert

* add examples

* adding entries to pretrained models (not finished)

* add other ablation models

* add multilingual

* add decode.sh train.sh finetune.sh and update links for README.md

* fix readme

* clean the codebase

---------

Co-authored-by: Anna Sun <13106449+annasun28@users.noreply.github.com>
2024-02-26 12:15:44 -08:00
Raphaël Merx
3f0f20f2d1
MMS alignment README fixes (#5432)
* Mention sox install through apt, on top of the Python wrapper
* Fix argument name in example command
2024-01-24 10:54:38 -08:00
Yoach Lacombe
fad2c4d1eb
Update README.md (#5407) 2024-01-08 14:38:14 -08:00
Can Balioglu
da8fb63088
Change Meta AI to FAIR (#5346) 2023-10-10 13:36:53 -04:00
Junteng Jia
c7c478b92f
fix iterator when loading from checkpoint (#5344)
Co-authored-by: Junteng Jia <juntengjia@fb.com>
2023-10-09 14:13:06 -07:00
Piyush Kansal
7409af7f9a
Keep task level checkpoint key name generic (#5330) 2023-09-15 19:15:19 -04:00
Piyush Kansal
e29f53bfea
initial revision (#5328) 2023-09-15 15:01:49 -04:00
Vineel Pratap
b5d89cddc9
Update align_and_segment.py (#5317)
Fix MMS alignment code
2023-09-07 11:25:28 -07:00
Nguyen Tu Anh
4db264940f
Add batchnorm option to hubert/wav2vec2 positional convolution layer for hubert bf16 models (#5285)
* add conv_batch_norm for hubert to support bf16

* linting

Co-authored-by: Bowen Shi <bshi@meta.com>
2023-08-18 17:10:40 +02:00
Egor Lakomkin
100cd91db1
Make RotaryPositionalEmbedding jit-compatible (#5237) 2023-07-07 08:08:01 +02:00
Yun Wang (Maigo)
31fba013a0
Register weights as a non-persistent buffer of SinusoidalPositionalEmbedding (#5213) 2023-06-23 13:31:52 -04:00
Patrick von Platen
a29952ce6d
Update README.md (#5211) 2023-06-20 10:15:05 -07:00
Andros Tjandra
8deb43af8c
add new instructions on how to get manifest *.tsv file (#5207)
Co-authored-by: Andros Tjandra <androstj@fb.com>
2023-06-17 16:43:16 -04:00
Vineel Pratap
91c364b7ce
Update MMS README.md (#5202) 2023-06-14 07:09:42 -07:00
Patrick von Platen
b2d5b78ffe
Add transformers MMS checkpoints to docs (#5186)
* Add transformers MMS checkpoints to docs

* Apply suggestions from code review

* Apply suggestions from code review

* Update examples/mms/README.md

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-06-03 21:35:26 -07:00
Andros Tjandra
456ffcfe4c
fix missing extra args in ConformerLayer (#5176)
* fix missing extra args in ConformerLayer

* fix extra args issue

---------

Co-authored-by: Andros Tjandra <androstj@fb.com>
2023-05-31 13:03:41 -04:00
chevalierNoir
533644c3fb
MMS TTS Romanian char fix + MPS support + full checkpoint (#5168)
* Fix ț filtering in Romanian at inference

* mps support + full checkpoints (discriminator+optimizer)

---------

Co-authored-by: Bowen Shi <bshi@meta.com>
2023-05-30 11:05:01 -07:00
chevalierNoir
ae59bd6d04
MMS TTS Colab Notebook (#5165)
* Add TTS Colab notebook


---------

Co-authored-by: Bowen Shi <bshi@meta.com>
2023-05-26 17:43:15 -04:00
Serkan Dayıcık
b30980349b
Feature: Implement UTF-8 readfile support in MMS for TTS example (#5148) 2023-05-25 18:51:56 -07:00
Vineel Pratap
35641fb41e
[MMS] Add a tutorial on CC LM decoding for ASR model (#5160)
* Update MMS_ASR_Inference_Colab.ipynb

* Update mms_infer.py
2023-05-25 18:50:46 -07:00
Vineel Pratap
68c52f52c6
[MMS] Create Colab notebook for LID inference (#5157)
* [MMS] Create Colab Notebook for LID task

* Update README.md

* Update README.md
2023-05-25 11:30:14 -07:00
Andros Tjandra
478787b95a
Create MMS_ASR_Inference_Colab.ipynb (#5151)
* Create MMS_ASR_Inference_Colab.ipynb

Added tutorial in Google Colab IPYNB fashion with small modification. Credit to epk2112 https://github.com/epk2112/fairseq_meta_mms_Google_Colab_implementation

* Add readme & ipynb

* Add readme & ipynb

* change colab hyperlink

---------

Co-authored-by: Andros Tjandra <androstj@fb.com>
2023-05-25 11:08:09 -07:00
Andros Tjandra
25c20e6a5e
Fix wrong input-output ASR input utts order (#5149)
Co-authored-by: Andros Tjandra <androstj@fb.com>
2023-05-24 11:17:46 -07:00
Kirill
b50b649357
Fix the MMS doc about LID manifest (#5144) 2023-05-24 06:09:01 -07:00
chevalierNoir
fea3361112
[MMS] TTS text uromanization + cpu inference (#5140)
* mms tts uroman + cpu support for inference

* remove mps support to accommodate all pytorch versions

* add explanation to arg

---------

Co-authored-by: Bowen Shi <bshi@meta.com>
2023-05-24 00:41:38 -04:00
Vineel Pratap
1082b61b12
[MMS] Update README.md (#5137)
Add dictionary files for MMS ASR models
2023-05-23 14:38:51 -07:00
Vineel Pratap
bc8e8b1251
MMS: Fix forced alignment API usage (#5138) 2023-05-23 13:52:57 -07:00
Vineel Pratap
87d3005630
[MMS] Fix concatenation of emissions (#5133) 2023-05-23 11:35:05 -07:00
Vineel Pratap
af12c9c640
Update README.md (#5118) 2023-05-22 14:16:36 -07:00
Vineel Pratap
c7bfa9bbc3
Update MMS README (#5115) 2023-05-22 12:40:22 -07:00
Vineel Pratap
aec128cb70
Update blog post link for MMS (#5114)
* Update blog post link for MMS

* Update blog post link for MMS
2023-05-22 12:29:19 -07:00
Vineel Pratap
728b947019
Mms release (#3948) (#5110) 2023-05-21 21:15:50 -07:00
Vineel Pratap
bfd9dc6d27
fix dict order (#5109) 2023-05-18 18:29:26 -07:00
Victoria X Lin
5ecbbf58d6
update XStoryCloze dataset description (#5100) 2023-05-08 17:31:22 -04:00
Victoria X Lin
b35e8efa87
XGLM paper camera-ready: add XStoryCloze data opensource (#4820)
* add XStoryCloze data

* upload XStoryCloze dataset files to s3 instead of git

* minor fixes

* minor fixes

* minor fixes

* minor fixes

* fix broken dataset doc link
2023-05-08 12:12:23 -07:00
Shuming Hu
3f6ba43f07
Fix DiverseBeamSearch so that no diversity groups will be dropped. (#5069) 2023-04-11 20:20:02 -04:00
Wei-Ning Hsu
176cd93498
add data2vec2 nlp model dictionary (#5045) 2023-03-27 16:57:46 -04:00
Franz Nowak
0338cdc309
fix imports referencing moved metrics.py file (#4840)
* fix imports referencing moved metrics.py file

* Make representation computation branchless in TransformerEncoderBase (#4818)

Summary:
We want to make the computation branchless here because fairseq code may be
exported and traced for deployment purposes, and tracing mechanisms can
break the correctness for a captured program if it's dependent on input data.
In this diff we try to rewrite the code to remove one branch so that tracer
can proceed here and preserve the correct semantics of the model.

Test Plan:
CI

Reviewers:

Subscribers:

Tasks:

Tags:

* Fix Torchscript typing in transformer_encoder.py (#4847)

* Add Generative Spoken Dialogue Language Modeling (#4879)

* Update deprecated torch.qr in glow.py example (#4685)

torch.qr has been deprecated for a long time and is being removed by https://github.com/pytorch/pytorch/pull/70989.

This PR makes the example compatible with new and old PyTorch versions.

* Emotion Conversion Paper Open Source (#4895)

* data2vec v2.0 (#4903)

data2vec 2.0
Co-authored-by: Arun Babu <arbabu@fb.com>
Co-authored-by: Wei-Ning Hsu <wnhsu@csail.mit.edu>

* remove missing config entries when loading task from checkpoint (#4905)

* make apex optional (#4906)

* Add file to generate manifests for stop dataset. (#4891)

* Update STOP dataset README to include proper link. (#4892)

* Update README.md (#4893)

* using foreach to reduce kernel (#4904)

* using foreach to reduce kernel

* set reproducibility to looser threshold

* revert optimizer

* update

* update

* update

* update

* update

* update

* update

Co-authored-by: juntengjia <juntengjia@fb.com>

* Update README.md to add data2vec blog post (#4913)

* Update README.md

* Update config to fix circleci failure (#4949)

https://app.circleci.com/pipelines/github/fairinternal/fairseq-py/12635/workflows/3befbae2-79c4-458d-9fc4-aad4484183b4/jobs/26767

* Generative Spoken Dialogue Language Modeling Paper Open Source (#4957)

* wav2vec2_laser (#4968)

* ASR BLEU tool copied from ust branch into main (#4914)

* Add transcript option for asr-bleu (#4981)

---------

Co-authored-by: zhxchen17 <zhxchen17@outlook.com>
Co-authored-by: zhxchen17 <zhxchen17@fb.com>
Co-authored-by: Nguyen Tu Anh <nguyentuanh208@gmail.com>
Co-authored-by: Sergii Dymchenko <kit1980@gmail.com>
Co-authored-by: Felix Kreuk <felixkreuk@gmail.com>
Co-authored-by: Alexei Baevski <alexei.b@gmail.com>
Co-authored-by: padentomasello <pdtomasello@gmail.com>
Co-authored-by: Junteng Jia <juntengjia@hotmail.com>
Co-authored-by: juntengjia <juntengjia@fb.com>
Co-authored-by: arbabu123 <arbabu@fb.com>
Co-authored-by: dianaml0 <82468439+dianaml0@users.noreply.github.com>
Co-authored-by: Pierre Andrews <mortimer@fb.com>
Co-authored-by: Ilia Kulikov <kulikov@cs.nyu.edu>
Co-authored-by: Xutai Ma <xutaima@gmail.com>
2023-02-23 16:18:36 -05:00
Xutai Ma
ad0e69cd99
Add transcript option for asr-bleu (#4981) 2023-02-08 23:16:22 -05:00
Ilia Kulikov
214c0cbd6f
ASR BLEU tool copied from ust branch into main (#4914) 2023-02-02 15:11:56 -08:00
Pierre Andrews
83c4e49e9d
wav2vec2_laser (#4968) 2023-02-02 17:04:53 +01:00
Nguyen Tu Anh
e7f6596bd4
Generative Spoken Dialogue Language Modeling Paper Open Source (#4957) 2023-01-26 12:40:20 -05:00
dianaml0
7d050ada7d
Update config to fix circleci failure (#4949)
https://app.circleci.com/pipelines/github/fairinternal/fairseq-py/12635/workflows/3befbae2-79c4-458d-9fc4-aad4484183b4/jobs/26767
2023-01-19 10:43:10 -05:00
arbabu123
58cc6cca18
Update README.md to add data2vec blog post (#4913)
* Update README.md
2022-12-20 10:34:07 -08:00
Junteng Jia
3c1abb59f5
using foreach to reduce kernel (#4904)
* using foreach to reduce kernel

* set reproducibility to looser threshold

* revert optimizer

* update

* update

* update

* update

* update

* update

* update

Co-authored-by: juntengjia <juntengjia@fb.com>
2022-12-16 17:06:21 -08:00