Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Enables logging of params and metrics with Aim. Aim is an open-source experiment tracker - https://github.com/aimhubio/aim
1. Added two arguments to CommonConfig:
- aim_repo: defines the Aim repository location; it can also be set to a remote URL (e.g. `aim://<ip>:<port>`)
- aim_run_hash: defines the run hash. If omitted, the run will be created or continued based on the `save_dir` argument: if there is an existing run with the same `save_dir`, it will be reopened/continued; otherwise a new run will be created.
2. Implemented the AimProgressBarWrapper class to handle logging
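The run-selection logic described above can be sketched as follows. This is a hedged, illustrative mock, not the actual Aim SDK: `get_or_create_run` and the dict-based registry are hypothetical stand-ins for the repository lookup.

```python
# Hypothetical sketch of the run-selection rule above: reopen an existing run
# whose save_dir matches, otherwise create a new one. The dict stands in for
# the Aim repository; nothing here is the real Aim API.
import uuid

_runs = {}  # run_hash -> {"save_dir": ...}

def get_or_create_run(save_dir, run_hash=None):
    """Return (run_hash, created) following the rules in the summary."""
    if run_hash is not None:            # explicit hash: reopen that run
        return run_hash, False
    for h, meta in _runs.items():       # no hash: match on save_dir
        if meta["save_dir"] == save_dir:
            return h, False             # continue the existing run
    h = uuid.uuid4().hex                # otherwise start a fresh run
    _runs[h] = {"save_dir": save_dir}
    return h, True
```

Calling it twice with the same `save_dir` yields the same hash; a new `save_dir` yields a new run.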
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4311
Reviewed By: ArmenAg
Differential Revision: D35177412
Pulled By: dianaml0
fbshipit-source-id: 287afe3a77e1048e497a4e1bdc42efd46ec9c2fe
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4313
Reviewed By: shruti-bh
Differential Revision: D35200613
Pulled By: dianaml0
fbshipit-source-id: c011f89f4a7ee9404bec61728b52fcea8640d292
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fix issue with `black` causing build error.
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4310
Reviewed By: shruti-bh
Differential Revision: D35151101
Pulled By: dianaml0
fbshipit-source-id: 63d80b848fdd3c004d784add3bf74e4c5281e952
Summary:
Releasing pre-trained mHuBERT, vocoder, speech normalizer for the paper "Textless Speech-to-Speech Translation on Real Data"
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
X-link: https://github.com/fairinternal/fairseq-py/pull/3245
Reviewed By: sravyapopuri388
Differential Revision: D35135891
Pulled By: an918tw
fbshipit-source-id: 96e0a6354dc61d5cbfce9943893bebadfb21b642
Summary:
As per title
Created from CodeHub with https://fburl.com/edit-in-codehub
Reviewed By: arbabu123
Differential Revision: D35151134
fbshipit-source-id: bb97ae583542c8e7983b9d9042d8a3084b8fbef5
Summary:
OSS "Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation" paper code
- Update xm_transformer to add two new arguments: encoder_proj (which ensures the encoder and decoder embedding dims are matched) and max_positions (related to the embedding size of the conformer).
- Add documentation and pretrained models related to the paper.
X-link: https://github.com/fairinternal/fairseq-py/pull/3233
Reviewed By: pipibjc
Differential Revision: D35119604
Pulled By: sravyapopuri388
fbshipit-source-id: bbe517c4803c5808f8cce0e5d16cf5ffa96f425c
Summary:
Per anj-s's suggestion, this seems to fix the
```
assert len(self.flat_params) == 1, "Incorrect access to flat_param"
AssertionError: Incorrect access to flat_param
```
error when training transformer models w/ large number of params
~~(not sure why the number of params affect fairscale FSDP wrapping???)~~ Did this maybe only manifest when the encoder/decoder individually had > 1e8 params due to the default of `min_params_to_wrap`?
Looking at D26771144 (656d7e5779) & https://github.com/fairinternal/fairseq-py/pull/1667 where this code was added - it's unclear why wrapping was specifically necessary when share_all_embeddings=False? Is it OK to just delete this code?
(And did the gshard model avoid this issue b/c it used share_all_embeddings=True?)
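A quick back-of-the-envelope check of the >1e8 hypothesis above. The threshold is assumed from fairseq's default `--min-params-to-wrap`; the vocabulary and embedding sizes are purely illustrative:

```python
# Hedged sketch: embedding tables alone can push an encoder/decoder past the
# assumed 1e8 default of min_params_to_wrap, which would change FSDP wrapping.
MIN_PARAMS_TO_WRAP = int(1e8)  # assumed default, not verified here

def embedding_params(vocab_size, embed_dim):
    """Parameter count of a single embedding table."""
    return vocab_size * embed_dim

small = embedding_params(50_000, 1024)    # ~51M params: below the threshold
large = embedding_params(250_000, 1024)   # 256M params: above the threshold

assert small < MIN_PARAMS_TO_WRAP < large
```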
Reviewed By: huihuifan
Differential Revision: D35084649
fbshipit-source-id: ad5b394c9920e3bea2767a0771f6de36aecb3687
Summary: Replace "prepend-tgt-lang-tag" with "prepend-tgt-lang-tag-as-bos" in s2s data loading and s2s task.
Reviewed By: yuntang
Differential Revision: D34912239
fbshipit-source-id: 654d0eafafc275be6c2470b08a323f57a4f9b9cb
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Support tgt-lang-tag in speech-to-speech task.
1. If we set prepend_tgt_lang_tag: true, a dictionary with units and language tags is loaded from vocab_filename; otherwise, a dictionary with units only is created in setup_task.
2. prepend_tgt_lang_tag adds the target language token to the beginning of prev_output_tokens during data loading.
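Step 2 above amounts to the following minimal sketch. The token IDs and the `<lang:xx>` tag format are illustrative, not fairseq's actual symbols:

```python
# Hedged sketch of prepending the target-language tag to prev_output_tokens
# during data loading; IDs and tag names are made up for illustration.
def prepend_tgt_lang_tag(prev_output_tokens, lang_tag_idx):
    """Place the language tag at position 0, shifting the rest right."""
    return [lang_tag_idx] + prev_output_tokens

dictionary = {"<lang:es>": 5, "hola": 6, "mundo": 7}
tokens = [dictionary["hola"], dictionary["mundo"]]
tokens = prepend_tgt_lang_tag(tokens, dictionary["<lang:es>"])
# tokens is now [5, 6, 7]
```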
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
X-link: https://github.com/fairinternal/fairseq-py/pull/3187
Reviewed By: yuntang
Differential Revision: D34768755
Pulled By: hygong-fb
fbshipit-source-id: fa395c3319907221f95333283689671b194f3ccc
Summary:
Our mission at Meta Open Source is to empower communities through open source, and we believe that it means building a welcoming and safe environment for all. As a part of this work, we are adding this banner in support for Ukraine during this crisis.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4249
Reviewed By: arbabu123
Differential Revision: D34635479
Pulled By: dmitryvinn-fb
fbshipit-source-id: 488d30f0967ae9542ead968c5cb951ecf0e02a64
Summary:
## What does this PR do?
Avoid throwing a ValueError when attempting to load a user-defined module from common.user_dir that has the same module name and module path as an already loaded module. This occurs when a job is preempted and restarted using submitit_slurm.
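The guard described above can be sketched like this. It is a hedged, hypothetical version of the check, not fairseq's actual `import_user_module`: a module already in `sys.modules` with the same name and path is simply returned instead of raising.

```python
# Illustrative sketch: skip re-importing a user module when an identically
# named module from the same path is already loaded (e.g. after a preempted
# job restarts). Names and error messages are assumptions.
import importlib.util
import os
import sys

def import_user_module(module_path, module_name):
    existing = sys.modules.get(module_name)
    if existing is not None:
        # Same name and same path: already loaded, nothing to do.
        if os.path.dirname(getattr(existing, "__file__", "") or "") == module_path:
            return existing
        raise ValueError(f"module {module_name} already loaded from another path")
    spec = importlib.util.spec_from_file_location(
        module_name, os.path.join(module_path, "__init__.py")
    )
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module
```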
X-link: https://github.com/fairinternal/fairseq-py/pull/3144
Reviewed By: Abdel-rahmanMohamed
Differential Revision: D34521450
Pulled By: wnhsu
fbshipit-source-id: eed00d4238a66dc524eee400a55ad2c011e1543c
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
release training instructions for unit-based HiFi-GAN vocoder with duration prediction
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
X-link: https://github.com/fairinternal/fairseq-py/pull/3156
Reviewed By: sravyapopuri388
Differential Revision: D34582951
Pulled By: an918tw
fbshipit-source-id: 2e575fb15aa8cd5444272c3c31426ac64da84e97
Summary:
# Before submitting
- [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
https://groups.google.com/g/fairseq-users/c/YoSm5J2To1A
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/4242
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4243
Reviewed By: arbabu123
Differential Revision: D34538164
Pulled By: alexeib
fbshipit-source-id: cf2fdaa7663bee34571fb3d3bd9bdaf79d756206
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/4058
While using the library, the following warnings are shown, which sometimes hinder the workflow:
`<USER_PATH>/fairseq/search.py:140: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
beams_buf = indices_buf // vocab_size`
`<USER_PATH>/fairseq/sequence_generator.py:666: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
unfin_idx = bbsz_idx // beam_size`
The methodology was simple: instead of using `//`, it was replaced by `torch.div(arg1, arg2, rounding_mode='trunc')`. The variable values do not change before and after; only the warning is resolved.
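Why the swap is safe here: the indices involved are non-negative, and `//` (floor) and trunc-mode division only disagree for negative operands, which is exactly what the PyTorch warning is about. A plain-Python illustration (no torch dependency; `floordiv`/`truncdiv` are just illustrative helpers):

```python
# Floor vs trunc division: identical on non-negative values (the beam-search
# case), different on negative ones (the case the deprecation warning warns about).
import math

def floordiv(a, b):
    return a // b                 # rounds toward -inf

def truncdiv(a, b):
    return math.trunc(a / b)      # rounds toward 0, like rounding_mode='trunc'

assert floordiv(7, 2) == truncdiv(7, 2) == 3   # non-negative: identical
assert floordiv(-7, 2) == -4                   # floor
assert truncdiv(-7, 2) == -3                   # trunc
```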
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Yes, I did! Thanks!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4221
Reviewed By: arbabu123
Differential Revision: D34538147
Pulled By: alexeib
fbshipit-source-id: 143897a249129a163b6a30ba9b5cf5595ef42330
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
X-link: https://github.com/fairinternal/fairseq-py/pull/3113
Reviewed By: an918tw, kahne
Differential Revision: D34365606
Pulled By: sravyapopuri388
fbshipit-source-id: aa4f0ab24ca191101b9eca0f5e08dcbedf9fadbb
Summary:
Best metric is now only logged for the first of all the validation subsets
# Before submitting
- [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
https://groups.google.com/g/fairseq-users/c/7nk3rJmvlg8
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/4162
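The behavior after this change (best metric tracked only for the first validation subset) can be sketched as follows. This is a hedged, simplified stand-in, not fairseq's actual validate loop:

```python
# Illustrative sketch: only subsets[0] updates the "best" metric used for
# checkpointing; later subsets are validated but do not touch it.
def validate(subsets, losses, best=None):
    """Return the best (lowest) loss, considering only subsets[0]."""
    for i, (subset, loss) in enumerate(zip(subsets, losses)):
        if i == 0:                      # only the first subset counts
            best = loss if best is None else min(best, loss)
    return best

best = validate(["valid", "valid1"], [2.5, 1.0])
assert best == 2.5                      # valid1's lower loss is ignored
```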
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4180
Reviewed By: michaelauli
Differential Revision: D34365416
Pulled By: alexeib
fbshipit-source-id: 872f77da2cbf064ed838ebc7959365b0b33fe723
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
X-link: https://github.com/fairinternal/fairseq-py/pull/3104
Reviewed By: kahne
Differential Revision: D34323889
Pulled By: sravyapopuri388
fbshipit-source-id: da7216bc5918fd0e57e10395044088a555af2e07
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3063
Reviewed By: eugene-kharitonov
Differential Revision: D34323605
Pulled By: wnhsu
fbshipit-source-id: 9dc779a6c399cda710863596e0880b9277ff2919
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3107
Reviewed By: cndn
Differential Revision: D34354339
Pulled By: sravyapopuri388
fbshipit-source-id: 50888706123d246c13d2cbb22d0e043740ff6bf5
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3065
Reviewed By: Mortimerp9
Differential Revision: D34144674
Pulled By: dianaml0
fbshipit-source-id: 842b0d29c9c85d4b56b640f2823fcb4e3f912f98
Summary:
The only difference with plain list/dict now is that nn.Parameters are
handled specially and registered as parameters properly.
test_nn and parametrization works locally.
Will see in CI if DP is fixed as well.
Tentative fix for https://github.com/pytorch/pytorch/issues/36035
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70499
Reviewed By: jbschlosser, alexeib
Differential Revision: D34005332
Pulled By: albanD
fbshipit-source-id: 7e76b0873d0fec345cb537e2a6ecba0258e662b9
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3059
Reviewed By: kahne
Differential Revision: D34083178
Pulled By: sravyapopuri388
fbshipit-source-id: a33af1696570be4826973b19fe34177bcf851e06
Summary:
ema.py, initially used by data2vec, was actually created for trainer-level EMA tracking.
Since data2vec creates and uses EMA tracking within the model, we will split EMA into a separate module-level implementation.
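Module-level EMA tracking in a nutshell: keep a shadow copy of each parameter and update it as `decay * shadow + (1 - decay) * param`. A hedged, plain-Python sketch (scalar "parameters" for illustration; not the actual ema.py API):

```python
# Minimal EMA sketch: the shadow values lag the live parameters, smoothed by
# the decay factor. Real implementations operate on tensors, not floats.
class EMA:
    def __init__(self, params, decay=0.999):
        self.decay = decay
        self.shadow = dict(params)      # copy of the tracked parameters

    def step(self, params):
        d = self.decay
        for name, value in params.items():
            self.shadow[name] = d * self.shadow[name] + (1 - d) * value

ema = EMA({"w": 0.0}, decay=0.9)
ema.step({"w": 1.0})
assert abs(ema.shadow["w"] - 0.1) < 1e-9   # 0.9 * 0.0 + 0.1 * 1.0
```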
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3036
Reviewed By: wnhsu
Differential Revision: D34034479
Pulled By: alexeib
fbshipit-source-id: f8c65552d446f1104c36380f5d1ff22a75e6e405
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3001
Reviewed By: kahne
Differential Revision: D33904550
Pulled By: sravyapopuri388
fbshipit-source-id: f55f8121d83e5abebdfcf7ac90dcba39f65cafaf
Summary: The GPU test was broken after D33809223 (1b61bbad32)
Reviewed By: cruvadom
Differential Revision: D33931570
fbshipit-source-id: 37962a437d8e25b1dafc58db0efa55c1afa5f3ee
Summary:
## What does this PR do?
1. Update HuBERT to work with the TransformerEncoder in wav2vec2.py
2. Remove dictionary loading issue when loading fine-tuned HuBERT checkpoints to make the checkpoints self-contained
3. Add unit-test for HuBERT fine-tuned checkpoints
4. Avoid divide-by-zero error in infer.py when inference time is zero (e.g., when inferring just one utterance)
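Item 4 as a tiny sketch. The function name and the choice to return `inf` for zero elapsed time are illustrative assumptions, not the actual infer.py fix:

```python
# Hedged sketch: guard the throughput computation when inference time is zero
# (e.g. a single short utterance), instead of raising ZeroDivisionError.
def sentences_per_second(num_sentences, elapsed_seconds):
    if elapsed_seconds == 0:
        return float("inf") if num_sentences else 0.0
    return num_sentences / elapsed_seconds

assert sentences_per_second(10, 2.0) == 5.0
assert sentences_per_second(1, 0.0) == float("inf")   # no ZeroDivisionError
```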
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3019
Reviewed By: andrewyeh
Differential Revision: D33970620
Pulled By: wnhsu
fbshipit-source-id: c523dd6ddb0f6a496be8b0b4b56f0c32c1d3dbc5
Summary:
This is the same as https://github.com/fairinternal/fairseq-py/issues/3003 but for main instead of gshard.
The lint test will run the latest version of black, which is 22.1.0 right now and seems to be incompatible with the 21.12b0 version that is set up in pre-commit. This means that some files were validly formatted in the past, but are not anymore...
This PR formats these files with 22.1.0 and auto-updates the pre-commit config to use that black version too.
(note: this is the second time this happens. A solution would be to pin the lint test to the same black version as the one in the pre-commit hook that was used to format everything clean, so that we have stable formatting)
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3004
Reviewed By: dianaml0
Differential Revision: D33917490
Pulled By: Mortimerp9
fbshipit-source-id: d55e800b976f94545cdab4132daa7c45cbd0e34c
Summary:
## What does this PR do?
Default values for the configs imported from `user_dir` were not added properly.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/3007
Reviewed By: alexeib
Differential Revision: D33926315
Pulled By: wnhsu
fbshipit-source-id: 914eecec769964686342d66c96d6ba76f12e1277
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4172
Reviewed By: punitkoura
Differential Revision: D33911169
Pulled By: todpole3
fbshipit-source-id: d3e111ab4b9a646e1799ad9335c70ec1ee8d25a4
Summary: EMA broken since D33649708 (995c204337) due to indentation error.
Reviewed By: cruvadom
Differential Revision: D33809223
fbshipit-source-id: c6c4d0d327443bfea787817040e1832eef0f50e4
Summary:
## What does this PR do?
- Add unit test for HuBERT
- update model args to comply with wav2vec2's TransformerEncoder
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2766
Reviewed By: Abdel-rahmanMohamed
Differential Revision: D32965218
Pulled By: wnhsu
fbshipit-source-id: 036a1644179c35b875c9ba30d75b4ef039fb328f
Summary:
Preliminaries for the data2vec release, including some minor improvements and bug fixes.
The most important change is that we now default to raising an exception when fields in the config do not have a corresponding field in the model dataclass.
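The stricter behavior can be sketched as follows. The dataclass and the `apply_overrides` helper are hypothetical; the point is just that an unknown config field now raises instead of being silently dropped:

```python
# Hedged sketch: reject config fields that have no counterpart in the model
# dataclass. ModelConfig and its fields are made up for illustration.
from dataclasses import dataclass, fields

@dataclass
class ModelConfig:
    encoder_layers: int = 12
    dropout: float = 0.1

def apply_overrides(cfg, overrides):
    known = {f.name for f in fields(cfg)}
    for key, value in overrides.items():
        if key not in known:
            raise AttributeError(
                f"config field '{key}' not found in {type(cfg).__name__}"
            )
        setattr(cfg, key, value)
    return cfg

cfg = apply_overrides(ModelConfig(), {"dropout": 0.3})
assert cfg.dropout == 0.3
```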
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2929
Reviewed By: wnhsu
Differential Revision: D33649708
Pulled By: alexeib
fbshipit-source-id: 629bdb4c361550740b451c570c2005bb956c6fcb
Summary:
Add scripts for multihead attention selection in multilingual and multi-domain training from the following paper:
"Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling", NeurIPS 2021.
Reviewed By: yuntang
Differential Revision: D31802221
fbshipit-source-id: 8c69b89bda29e6857bd3af02979c07e1b5cf49f1
Summary: Add an option to use the EMA model for decoding in the transducer IPL recipe by passing --ipl-decode-ema. Note that EMA should be enabled, as in diff D24238379 (8feccf9441), using the options --store-ema, --ema-start-update, and --ema-decay.
Reviewed By: cruvadom
Differential Revision: D31983366
fbshipit-source-id: 2bf63b3f7d1b5fa8804b3a7e9bfab71a463ca957
Summary:
Add scripts for multihead attention selection in multilingual and multi-domain training from the following paper:
"Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling", NeurIPS 2021.
Reviewed By: yuntang
Differential Revision: D31781212
fbshipit-source-id: 8e1a596826f682f80730c251ec31c68df0de6516
Summary:
Support FFN pruning for fairseq. For example, the user can apply pruning on top of a RoBERTa base model by specifying the argument `--ffn-blocks-to-remove 1024`. The user also needs to provide a ckpt which is already pruned, so that the pruned ckpt can be loaded correctly.
The pruning procedure can be summarized as:
1. Fine-tune the model (e.g. the RoBERTa encoder) on a certain dataset with regularization.
2. After the model is trained, use the _get_fc_rank and _prune_fc_layer functions to get the top X blocks with the most importance in each transformer layer, then use that ranking to prune a new RoBERTa encoder and save the pruned ckpt manually.
3. Fine-tune the new RoBERTa encoder via the ckpt saved above.
Reviewed By: dianaml0
Differential Revision: D33525055
fbshipit-source-id: 5087140ee891d6ec9266726e3a477947c233412c