Commit Graph

38 Commits

Wei
acd9a53607
update isort (#4568)
Co-authored-by: dianaml0 <82468439+dianaml0@users.noreply.github.com>
2022-08-01 14:26:36 -07:00
Wei Wei
d364fdbb26 Reland BT enablement on fairseq - fairseq change (#4513)
Summary:
Pull Request resolved: https://github.com/facebookresearch/fairseq/pull/4513
With some fixes to torchscript using dual copies.
Reland this diff.

Reviewed By: erichan1

Differential Revision: D37371293

fbshipit-source-id: 4fcfc4083955b6f5fc4ef8600f1b517b6ba69aae
2022-06-24 19:03:29 -07:00
Wei Ho
956fcf495b Back out "BT enablement on fairseq - fairseq change"
Summary:
Context: https://fburl.com/7vdj7vhl

Backing out due to breaking our TorchScript test:
```
RuntimeError:
method cannot be used as a value:
  File "/dev/shm/uid-30041/54641b26-seed-nspid4026533396_cgpid7154327-ns-4026533393/fairseq/modules/transformer_layer.py", line 307
                self.in_proj_weight,
                self.in_proj_bias,
                self.self_attn.out_proj.weight,
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
                self.self_attn.out_proj.bias,
                self.activation_relu_or_gelu == 2,

Stack trace:
Exception type: torch::jit::ErrorReport
```
https://fburl.com/sandcastle/4pzqemf5

Original commit changeset: 984266f850fc

Original Phabricator Diff: D37082681 (3a757d7ab2)

Differential Revision: D37303846

fbshipit-source-id: 1757ea5dae98be5beb4d08f70b0c3001d6ea336f
2022-06-21 17:27:50 -07:00
Wei Wei
3a757d7ab2 BT enablement on fairseq - fairseq change (#4480)
Summary:
Pull Request resolved: https://github.com/facebookresearch/fairseq/pull/4480

as titled and depends on D36057338
Fork the inference path inside the forward function: if a checkpoint is loaded and we are running inference, we deploy BT; otherwise the existing fairseq path is taken.

In summary:
Accuracy: there is a small accuracy loss due to fp16; the maximum diff is around 0.009. With fp32 there is no accuracy loss.
Perf: the current fairseq has similar speed to the vanilla version. After the enablement, the speedup is similar to the standalone BT test.
With batch size = 64:
For V100, the speedup reaches 1.23x
For A100, the speedup reaches 1.38x

After enabling nested tensors,
For V100, the speedup reaches 2.46x

Reviewed By: mikekgfb

Differential Revision: D37082681

fbshipit-source-id: 984266f850fc30603e48be56e41ac2c67da080f5
2022-06-15 21:48:41 -07:00
Diana Liskovich
a54021305d formatting fix (#2816)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
fix `black` failures

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2816

Reviewed By: alexeib

Differential Revision: D33172615

Pulled By: dianaml0

fbshipit-source-id: 36b141f42941670f1bfa981041d878042feb0428
2021-12-16 16:11:19 -08:00
Jingfei Du
16ebfa752c Revert prefix beamsearch fix (#2763)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Reverting to fix the issue mentioned [here](https://github.com/pytorch/fairseq/issues/3913). A follow-up PR will fix the original issue later.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2763

Reviewed By: myleott

Differential Revision: D33000411

Pulled By: jingfeidu

fbshipit-source-id: 95a54cbdc612129a0eab4b5e6aa576a5bcf00588
2021-12-14 13:22:09 -08:00
dianaml0
88e7d2586b fix flake8 issues (#2570)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
- [x] applies flake8 fixes to main branch (https://github.com/fairinternal/fairseq-py/issues/2546) - still more to be fixed

Fix GPU tests:
- [x] when the torch.ao.quantization import doesn't work, fall back to torch.quantization
- [x] build apex from an earlier commit in circleci so that it's compatible with pytorch 1.8 and 1.9

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2570

Reviewed By: Mortimerp9

Differential Revision: D32955312

Pulled By: dianaml0

fbshipit-source-id: e163cbd4998f171f819e31b0682c1c0f1986f9e1
2021-12-09 02:34:30 -08:00
dianaml0
0dfd6b6240 Add linting with black (#2678)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2678

Reviewed By: Mortimerp9

Differential Revision: D32653381

Pulled By: dianaml0

fbshipit-source-id: 2810d14867cd7d64f4d340740e2b590b82de47fe
2021-11-29 12:32:59 -08:00
Jingfei Du
932a3d4aad fix beam search with prefix tokens (#2227)
Summary:
1. added a test for generating pad tokens during beam search with prefix tokens
2. modified lprobs for the pad token and prefix tokens to avoid generating pad

# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2227

Reviewed By: xianxl

Differential Revision: D30649356

Pulled By: jingfeidu

fbshipit-source-id: d94903a912e767391c8fca61f98f65b5cea3b56e
2021-08-30 18:07:13 -07:00
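The fix above comes down to masking the step's log-probabilities while a forced prefix is being consumed, so that only the prefix token can win and pad can never be emitted. A minimal pure-Python sketch of that idea (names and signature are illustrative, not fairseq's actual API):

```python
NEG_INF = float("-inf")

def constrain_to_prefix(lprobs, prefix_token, pad_idx):
    """Mask one step's log-probs so only the forced prefix token survives.

    Every other entry, including pad (`pad_idx`), is set to -inf, so beam
    search cannot emit pad while it is still consuming the prefix.
    """
    out = [NEG_INF] * len(lprobs)
    out[prefix_token] = lprobs[prefix_token]
    return out
```

In the real generator the same masking is applied per beam on a tensor of log-probs; the list version here only shows the selection logic.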
Sam Shleifer
bff7f85206 fastseq ngram blocking (#1509)
Summary:
Command:
```bash
fairseq-generate \
    ~myleott/data/data-bin/wmt16_en_de_bpe32k/ \
    --path /checkpoint/myleott/s3/models/wmt16.en-de.joined-dict.transformer/model.pt \
    --beam 4 --remove-bpe --lenpen 0.6 --batch-size 256 --no-repeat-ngram-size 3 \
    --gen-subset test --fp16
```

master/devfair: 297.8s (10.08 sentences/s, 286.47 tokens/s)
branch/devfair: 31.9s (94.27 sentences/s, 2678.66 tokens/s)

master/v100: 227.4s (13.21 sentences/s, 375.24 tokens/s)
branch/v100: 13.1s (228.68 sentences/s, 6497.99 tokens/s)
(all BLEU4=29.17)

### ToDo:
- tests

### Future Work
- test other fastseq proposed improvements.

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1509

Reviewed By: myleott

Differential Revision: D25587857

Pulled By: sshleifer

fbshipit-source-id: d42af5c50e3f94c90e878f92da5ce5ef3fc8b988
2020-12-30 12:58:09 -08:00
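For context, n-gram blocking bans any token that would complete an n-gram already present in the hypothesis. A minimal pure-Python sketch of the technique (illustrative only, not the vectorized fastseq implementation benchmarked above):

```python
def block_repeated_ngrams(tokens, lprobs, n):
    """Set lprobs to -inf for tokens that would repeat an n-gram in `tokens`."""
    if n <= 0 or len(tokens) < n - 1:
        return lprobs
    # the last n-1 generated tokens form the current n-gram prefix
    prefix = tuple(tokens[len(tokens) - (n - 1):]) if n > 1 else ()
    banned = set()
    for i in range(len(tokens) - n + 1):
        # if this historical n-gram starts with the current prefix,
        # its final token must not be generated again
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    out = list(lprobs)
    for tok in banned:
        out[tok] = float("-inf")
    return out
```

With `tokens = [1, 2, 3, 1, 2]` and `n = 3`, the prefix `(1, 2)` was previously followed by `3`, so token 3 is banned at this step.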
Myle Ott
a48f235636 Apply black+isort (#1357)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1357

Reviewed By: alexeib

Differential Revision: D24377772

fbshipit-source-id: 51581af041d42d62166b33a35a1a4228b1a76f0c
2020-10-18 18:14:51 -07:00
Mu Tian
42c5dcbd18 hydra fairseq 3 - inherit from legacy for fairseq classes
Summary: hydra fairseq 3 - inherit from legacy for fairseq classes

Reviewed By: alexeib

Differential Revision: D23375457

fbshipit-source-id: ef9d19f2d02f2326eea44a70f1f6e1668b420840
2020-09-09 17:02:13 -07:00
Marco Gaido
11345a7608 Pass all net_inputs in SequenceGenerator (#2090)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?

## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/2022.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2090

Reviewed By: cndn

Differential Revision: D21385984

Pulled By: myleott

fbshipit-source-id: 1428e02e625b8625df71a83c05dcf933c3f899df
2020-05-10 06:13:06 -07:00
Myle Ott
7a6519f84f Bugfixes (#1159)
Summary:
Several bugfixes to get tests passing on OSS master
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1159

Reviewed By: ngoyal2707

Differential Revision: D21331993

Pulled By: myleott

fbshipit-source-id: 327ae19f6797f92b8c6083a49d5f5edb0872223e
2020-05-01 04:09:37 -07:00
Ning Dong
b1af3e33d5 Modify gated unit tests to fix Fairseq OSS (#2059)
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2059

test_ensemble_sequence_generator and test_export_ensemble_model are green on fbcode master, but the PyTorch 1.5 release cut happened before the TorchScript fix, so update the gate to 1.6.
Remove the quantization test from fairseq, as FBGEMM is bound on the OSS side. Will add the test back in fbtranslate, but land this first to fix OSS-side failures.

Reviewed By: myleott

Differential Revision: D21231873

fbshipit-source-id: 8a2ad7dbed118ca8e3f4c351c399a82fd9740445
2020-04-24 13:29:50 -07:00
Ning Dong
b142b7d9ec Script _no_repeat_ngram in fb_simple_sequence_generator (#1963)
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1963

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1128

It's a common issue that short inputs (< 5 tokens) get repeated due to the default length constraint (max_len_a=1.1, max_len_b=5) https://fb.workplace.com/groups/2286753504877951/permalink/2674177509468880/.

In the future we want to use no_ngram_repeat to handle this issue. The functionality is in the sequence generator, but it needs to be scripted for production use.

Reviewed By: liuchen9494

Differential Revision: D20801865

fbshipit-source-id: c3085f19921adb85415636d16ce31e3826642335
2020-04-10 14:44:42 -07:00
Ning Dong
08691f8d0b Support quantization in Fairseq Sequence generator
Summary: The fix in MHA was suggested by driazati, to avoid JIT compilation of the if-branch in the MHA forward when scripting. Without this, quantization wouldn't work. Details in https://fb.workplace.com/groups/2240361332735959/permalink/626166461295703/

Reviewed By: jhcross

Differential Revision: D20881076

fbshipit-source-id: b50347b45cd7dbdef02ac7b71316ba734019f57e
2020-04-08 17:48:54 -07:00
Chen Liu
d37529ed23 Script reorder_incremental_state in fairseq baseline model (#1127)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1127

Pull Request resolved: https://github.com/pytorch/fairseq/pull/1953

Script the `reorder_incremental_states` in the base FairseqModel
Remove the overriding scriptable `reorder_incremental_states` in the TransformerModel
Change the decoder_len, since len(Tuple) is supported in Script

Relanded reverted diff D20797390

Reviewed By: myleott

Differential Revision: D20896200

fbshipit-source-id: cc4ae34f89f16007656cce6ec6f7e01b13899278
2020-04-07 15:01:31 -07:00
Chen Liu
1b749f4a34 Deprecate the SequenceGenerator with the Scripted version (#1120)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1120

Pull Request resolved: https://github.com/pytorch/fairseq/pull/1940

Deprecate the SequenceGenerator in Fairseq in favor of the Scripted version.

Pass all integration unit tests

- Copy ScriptSequenceGenerator to SequenceGenerator:
  - Modified forward_decoder to fix a bug when using adaptive_softmax in `get_prob_normalize` (marked with an inline comment)
  - Add support for other EnsembleModels as an input arg (marked with an inline comment)
- Add `FBEnsembleModelWithFork` to support fork/join in EnsembleModel
  - Add `test_fb_ensemble_model` to test the fork/join feature
  - There are still bugs in the fork/join feature when running through the fairseq interface (e.g. generation and interactive). Needs further investigation P128130029. cc cndn, jhcross
- Modified the SequenceGenerator initialization interface
- Clean up the code: delete unused functions `get_normalized_probs` and `_decode`

Reland reverted diff D20685075

Reviewed By: cndn

Differential Revision: D20895977

fbshipit-source-id: 424ee318e67d5d6ffed3edb92c7fa78485ba34af
2020-04-07 13:28:30 -07:00
Aapo Kyrola
966436403e Revert D20685075: Deprecate the SequenceGenerator with the Scripted version
Differential Revision:
D20685075

Original commit changeset: 046b76874465

fbshipit-source-id: 7ec2a2ca3b90251a560e2323c22b52ec7436fecb
2020-04-07 00:59:53 -07:00
Aapo Kyrola
8a528888e4 Revert D20797390: Script reorder_incremental_state in fairseq baseline model
Differential Revision:
D20797390

Original commit changeset: ab29874973ad

fbshipit-source-id: efd2d720c96ee90d1e8dc36178e04f0bf5510278
2020-04-07 00:59:48 -07:00
Chen Liu
d369c88019 Script reorder_incremental_state in fairseq baseline model (#1127)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1127

Pull Request resolved: https://github.com/pytorch/fairseq/pull/1953

Script the `reorder_incremental_states` in the base FairseqModel
Remove the overriding scriptable `reorder_incremental_states` in the TransformerModel
Change the decoder_len, since len(Tuple) is supported in Script

Reviewed By: myleott

Differential Revision: D20797390

fbshipit-source-id: ab29874973adc5dbd556c591942a0e071c81fc52
2020-04-06 20:40:40 -07:00
Chen Liu
bc93681348 Deprecate the SequenceGenerator with the Scripted version (#1120)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1120

Pull Request resolved: https://github.com/pytorch/fairseq/pull/1940

Deprecate the SequenceGenerator in Fairseq in favor of the Scripted version.

Pass all integration unit tests

- Copy ScriptSequenceGenerator to SequenceGenerator:
  - Modified forward_decoder to fix a bug when using adaptive_softmax in `get_prob_normalize` (marked with an inline comment)
  - Add support for other EnsembleModels as an input arg (marked with an inline comment)
- Add `FBEnsembleModelWithFork` to support fork/join in EnsembleModel
  - Add `test_fb_ensemble_model` to test the fork/join feature
  - There are still bugs in the fork/join feature when running through the fairseq interface (e.g. generation and interactive). Needs further investigation P128130029. cc cndn, jhcross
- Modified the SequenceGenerator initialization interface
- Clean up the code: delete unused functions `get_normalized_probs` and `_decode`

Reviewed By: myleott

Differential Revision: D20685075

fbshipit-source-id: 046b76874465a70d8118a97ad670311c6ce1d1c8
2020-04-06 17:47:47 -07:00
Marco Gaido
431d604f69 Fix generation with encoders which return an output of a shape different from the input (#1792)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?

## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/1791.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1792

Reviewed By: jmp84

Differential Revision: D20322704

Pulled By: myleott

fbshipit-source-id: 3cfa1bddda06b966e9dc9bc8ff183009d844b23c
2020-03-10 11:51:08 -07:00
Aleksandra Piktus
fab2e86e51 Add a diverse beam search variant to sequence_generator.py (#953)
Summary:
This PR implements a new generation strategy that we experimented with in project Pinocchio (https://github.com/fairinternal/Pinocchio); see the paper submission at https://fburl.com/hduj2me7.

Specifically in this PR:
- added a Diverse Beam Search variant as described in https://arxiv.org/abs/1611.08562
- moved the Search object generation out of `sequence_generation.py`, which allows for limiting the number of kwargs passed around
- made sure the above changes are backward compatible based on grep - P124083926
- added test cases covering these scenarios
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/953

Test Plan:
- `python -m unittest tests.test_binaries -v`- including added test cases, see issues below for some details
- `python -m unittest tests.test_sequence_generator -v` - including added test cases
- tested locally in conjunction with the Pinocchio repo
- grepped for all instantiations of `SequenceGeneration`, made sure they're backward compatible

# Issues
- when I try to run all tests with the `python -m unittest tests.test_binaries -v` command, the execution gets stuck on `test_binaries.TestTranslation.test_generation` - the test otherwise passes without problems when run individually. Is this a known problem?
- discovered T59235948 - assigned to fairseq oncall

Reviewed By: myleott, fabiopetroni

Differential Revision: D19142394

Pulled By: ola13

fbshipit-source-id: d24543424c14a9537e7b6485951d9f841da62b07
2020-01-06 08:24:02 -08:00
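The diverse beam search variant added above splits the beam into groups and, at each step, penalizes a group for choosing tokens that earlier groups have already chosen. A minimal pure-Python sketch of one such decoding step (greedy within each group, for illustration only; the real generator works on per-beam tensors):

```python
def diverse_beam_step(group_lprobs, strength=0.5):
    """One step of group-wise diverse beam search.

    Each group picks its best token after subtracting `strength` times
    the number of earlier groups that already picked that token this step.
    """
    counts = {}   # token -> how many earlier groups chose it this step
    picks = []
    for lprobs in group_lprobs:
        adjusted = [lp - strength * counts.get(tok, 0)
                    for tok, lp in enumerate(lprobs)]
        best = max(range(len(adjusted)), key=adjusted.__getitem__)
        picks.append(best)
        counts[best] = counts.get(best, 0) + 1
    return picks
```

With two groups sharing the same scores, the penalty pushes the second group onto a different token, which is exactly the diversity effect the commit describes.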
Myle Ott
4abadbdf77 Fix sampling with beam>1
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/792

Differential Revision: D16591987

Pulled By: myleott

fbshipit-source-id: d27c490ae75f80ded19226b8384f4776485dd694
2019-08-01 07:34:06 -07:00
Myle Ott
e75cff5f2c Relicense fairseq under MIT license (#786)
Summary:
The previous BSD+PATENTS license was controversial. We have been
approved to relicense fairseq under the MIT license.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/786

Differential Revision: D16560654

Pulled By: myleott

fbshipit-source-id: f78b1beb4f2895dd7b9bfc79f5f952a2bfb94034
2019-07-30 07:48:23 -07:00
Xing Zhou
e46b924dea Nucleus (top-P) sampling (#710)
Summary:
Implement Nucleus (top-P) sampling: sample among the smallest set of elements whose cumulative probability mass exceeds p.

To test it:
python generate.py   ~myleott/data/data-bin/wmt17_zh_en_full/   --path ~myleott/zh_en/model.pt   --remove-bpe   --nbest 5   --beam 5 --sampling --sampling-topp 0.3
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/710

Test Plan:
python generate.py   ~myleott/data/data-bin/wmt17_zh_en_full/   --path ~myleott/zh_en/model.pt   --remove-bpe   --nbest 5   --beam 5 --sampling --sampling-topp 0.3

python tests/test_sequence_generator.py

python tests/test_binaries.py

Reviewed By: myleott

Differential Revision: D16286688

Pulled By: xingz9

fbshipit-source-id: 1776d21e17c4532a3d24ac75bb7e75da9acad58f
2019-07-17 06:21:33 -07:00
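Nucleus (top-p) sampling as described above can be sketched in a few lines of pure Python: keep the smallest set of top tokens whose cumulative probability reaches p, renormalize within that set, and sample. This is an illustrative sketch, not fairseq's implementation:

```python
import random

def top_p_sample(probs, p, rng=random):
    """Sample among the smallest set of tokens whose cumulative mass >= p."""
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    nucleus, total = [], 0.0
    for i in order:
        nucleus.append(i)
        total += probs[i]
        if total >= p:
            break
    # draw within the nucleus, renormalized by its total mass
    r = rng.random() * total
    acc = 0.0
    for i in nucleus:
        acc += probs[i]
        if r <= acc:
            return i
    return nucleus[-1]
```

For example, with `probs = [0.6, 0.3, 0.1]` and `p = 0.3` the nucleus is just the top token, so the sample is always token 0.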
Myle Ott
b65c579bed Modularize generate.py (#351)
Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/351

This makes it easier for tasks to plugin to generate.py/interactive.py
Pull Request resolved: https://github.com/pytorch/fairseq/pull/520

Differential Revision: D14183881

Pulled By: myleott

fbshipit-source-id: ede5e53ddc1215ed3b12b8f1eba048c946913c33
2019-02-22 10:08:52 -08:00
Myle Ott
864b89d044 Online backtranslation module
Co-authored-by: liezl200 <lie@fb.com>
2018-09-25 17:36:43 -04:00
Stephen Roller
bfeb773214 Pass encoder_input to generator, rather than src_tokens/src_lengths. 2018-09-25 17:36:43 -04:00
Myle Ott
311d2c6ca9 Revert sequence generator changes 2018-09-25 17:36:43 -04:00
Stephen Roller
e6d45d5cd7 Generator: net_input instead of manual src_tokens. 2018-09-25 17:36:43 -04:00
Myle Ott
8c0ca1a0c1 Diverse Beam Search 2018-09-03 19:15:23 -04:00
Myle Ott
6edf81ddfe
Remove more Variable() calls (#198) 2018-06-25 12:23:04 -04:00
Myle Ott
ff68a9ef50 Add FairseqTask
A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.

Changes:
- Add TranslationTask and LanguageModelingTask. New tasks can be registered with @register_task decorator.
- Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
- Remove LEFT_PAD_* constants and make them configurable per task
2018-06-15 13:05:22 -06:00
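The `@register_task` decorator mentioned above follows the standard registry pattern: a decorator maps a string name to a class so tasks can be looked up by command-line name. A minimal sketch of how such a registry works (simplified relative to fairseq's actual implementation):

```python
TASK_REGISTRY = {}

def register_task(name):
    """Minimal registry decorator in the spirit of fairseq's @register_task."""
    def wrapper(cls):
        if name in TASK_REGISTRY:
            raise ValueError("Cannot register duplicate task: " + name)
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper

@register_task("translation")
class TranslationTask:
    pass

# tasks can now be looked up by name, e.g. when parsing a --task argument
task_cls = TASK_REGISTRY["translation"]
```

The same pattern underlies `register_model`, `register_criterion`, and `register_optimizer` mentioned in the distributed-training commit below.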
Myle Ott
d3795d6cd1
Merge internal changes (#136)
Changes:
- 7d19e36: Add `--sampling` flag to generate.py to sample instead of doing beam search
- c777340: Add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model
- 3ea882c: Add `--max-update` option to train.py to stop training after a given number of updates
- small bugfixes for distributed training, LSTM, inverse square root LR scheduler
2018-04-02 10:13:07 -04:00
Myle Ott
6641520612
fairseq-py goes distributed (#106)
This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.

Changes:
- c7033ef: add support for distributed training! See updated README for usage.
- e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc.
- 154e440: update LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf
- 90c2973 and 1da6265: improve unit test coverage
2018-02-27 17:09:42 -05:00