Commit Graph

38 Commits

Wei
acd9a53607
update isort (#4568)
Co-authored-by: dianaml0 <82468439+dianaml0@users.noreply.github.com>
2022-08-01 14:26:36 -07:00
Wei Wei
d364fdbb26 Reland BT enablement on fairseq - fairseq change (#4513)
Summary:
Pull Request resolved: https://github.com/facebookresearch/fairseq/pull/4513
With some fixes to torchscript using dual copies.
Reland this diff.

Reviewed By: erichan1

Differential Revision: D37371293

fbshipit-source-id: 4fcfc4083955b6f5fc4ef8600f1b517b6ba69aae
2022-06-24 19:03:29 -07:00
Wei Ho
956fcf495b Back out "BT enablement on fairseq - fairseq change"
Summary:
Context: https://fburl.com/7vdj7vhl

Backing out due to breaking our TorchScript test:
```
RuntimeError:
method cannot be used as a value:
  File "/dev/shm/uid-30041/54641b26-seed-nspid4026533396_cgpid7154327-ns-4026533393/fairseq/modules/transformer_layer.py", line 307
                self.in_proj_weight,
                self.in_proj_bias,
                self.self_attn.out_proj.weight,
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
                self.self_attn.out_proj.bias,
                self.activation_relu_or_gelu == 2,

Stack trace:
Exception type: torch::jit::ErrorReport
```
https://fburl.com/sandcastle/4pzqemf5

Original commit changeset: 984266f850fc

Original Phabricator Diff: D37082681 (3a757d7ab2)

Differential Revision: D37303846

fbshipit-source-id: 1757ea5dae98be5beb4d08f70b0c3001d6ea336f
2022-06-21 17:27:50 -07:00
Wei Wei
3a757d7ab2 BT enablement on fairseq - fairseq change (#4480)
Summary:
Pull Request resolved: https://github.com/facebookresearch/fairseq/pull/4480

as titled and depends on D36057338
Fork the inference path inside the forward function: if a checkpoint is loaded and we are running inference, we deploy BT; otherwise the existing fairseq path is taken.

In summary:
Accuracy: there is a small accuracy loss due to fp16; the maximum diff is around 0.009. With fp32 there is no accuracy loss.
Perf: the current fairseq has similar speed to the vanilla version. After the enablement, the speedup is similar to the standalone BT test.
With batch size = 64:
For V100, the speedup reaches 1.23x
For A100, the speedup reaches 1.38x

After enabling nested tensors,
For V100, the speedup reaches 2.46x

Reviewed By: mikekgfb

Differential Revision: D37082681

fbshipit-source-id: 984266f850fc30603e48be56e41ac2c67da080f5
2022-06-15 21:48:41 -07:00
Diana Liskovich
a54021305d formatting fix (#2816)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
fix `black` failures

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2816

Reviewed By: alexeib

Differential Revision: D33172615

Pulled By: dianaml0

fbshipit-source-id: 36b141f42941670f1bfa981041d878042feb0428
2021-12-16 16:11:19 -08:00
Jingfei Du
16ebfa752c Revert prefix beamsearch fix (#2763)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Reverting to fix the issue mentioned [here](https://github.com/pytorch/fairseq/issues/3913). A follow-up PR will fix the original issue later.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2763

Reviewed By: myleott

Differential Revision: D33000411

Pulled By: jingfeidu

fbshipit-source-id: 95a54cbdc612129a0eab4b5e6aa576a5bcf00588
2021-12-14 13:22:09 -08:00
dianaml0
88e7d2586b fix flake8 issues (#2570)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
- [x] applies flake8 fixes to main branch (https://github.com/fairinternal/fairseq-py/issues/2546) - still more to be fixed

Fix GPU tests:
- [x] when the torch.ao.quantization import doesn't work, fall back to torch.quantization
- [x] build apex from an earlier commit in circleci so that it's compatible with pytorch 1.8 and 1.9

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2570

Reviewed By: Mortimerp9

Differential Revision: D32955312

Pulled By: dianaml0

fbshipit-source-id: e163cbd4998f171f819e31b0682c1c0f1986f9e1
2021-12-09 02:34:30 -08:00
dianaml0
0dfd6b6240 Add linting with black (#2678)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2678

Reviewed By: Mortimerp9

Differential Revision: D32653381

Pulled By: dianaml0

fbshipit-source-id: 2810d14867cd7d64f4d340740e2b590b82de47fe
2021-11-29 12:32:59 -08:00
Jingfei Du
932a3d4aad fix beam search with prefix tokens (#2227)
Summary:
1. added a test for generating pad tokens during beam search with prefix tokens
2. modified lprobs for the pad token and prefix tokens to avoid generating pad

# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2227

Reviewed By: xianxl

Differential Revision: D30649356

Pulled By: jingfeidu

fbshipit-source-id: d94903a912e767391c8fca61f98f65b5cea3b56e
2021-08-30 18:07:13 -07:00
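The fix above comes down to masking the step's log-probabilities while a forced prefix is being consumed, so that only the prefix token can win and pad can never be emitted. A minimal pure-Python sketch of that idea (names and signature are illustrative, not fairseq's actual API):

```python
NEG_INF = float("-inf")

def constrain_to_prefix(lprobs, prefix_token, pad_idx):
    """Mask one step's log-probs so only the forced prefix token survives.

    Every other entry, including pad (`pad_idx`), is set to -inf, so beam
    search cannot emit pad while it is still consuming the prefix.
    """
    out = [NEG_INF] * len(lprobs)
    out[prefix_token] = lprobs[prefix_token]
    return out
```

In the real generator the same masking is applied per beam on a tensor of log-probs; the list version here only shows the selection logic.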
Sam Shleifer
bff7f85206 fastseq ngram blocking (#1509)
Summary:
Command:
```bash
fairseq-generate \
    ~myleott/data/data-bin/wmt16_en_de_bpe32k/ \
    --path /checkpoint/myleott/s3/models/wmt16.en-de.joined-dict.transformer/model.pt \
    --beam 4 --remove-bpe --lenpen 0.6 --batch-size 256 --no-repeat-ngram-size 3 \
    --gen-subset test --fp16
```

master/devfair: 297.8s (10.08 sentences/s, 286.47 tokens/s)
branch/devfair: 31.9s (94.27 sentences/s, 2678.66 tokens/s)

master/v100: 227.4s (13.21 sentences/s, 375.24 tokens/s)
branch/v100: 13.1s (228.68 sentences/s, 6497.99 tokens/s)
(all BLEU4=29.17)

### ToDo:
- tests

### Future Work
- test other fastseq proposed improvements.

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1509

Reviewed By: myleott

Differential Revision: D25587857

Pulled By: sshleifer

fbshipit-source-id: d42af5c50e3f94c90e878f92da5ce5ef3fc8b988
2020-12-30 12:58:09 -08:00
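For context, n-gram blocking bans any token that would complete an n-gram already present in the hypothesis. A minimal pure-Python sketch of the technique (illustrative only, not the vectorized fastseq implementation benchmarked above):

```python
def block_repeated_ngrams(tokens, lprobs, n):
    """Set lprobs to -inf for tokens that would repeat an n-gram in `tokens`."""
    if n <= 0 or len(tokens) < n - 1:
        return lprobs
    # the last n-1 generated tokens form the current n-gram prefix
    prefix = tuple(tokens[len(tokens) - (n - 1):]) if n > 1 else ()
    banned = set()
    for i in range(len(tokens) - n + 1):
        # if this historical n-gram starts with the current prefix,
        # its final token must not be generated again
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    out = list(lprobs)
    for tok in banned:
        out[tok] = float("-inf")
    return out
```

With `tokens = [1, 2, 3, 1, 2]` and `n = 3`, the prefix `(1, 2)` was previously followed by `3`, so token 3 is banned at this step.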
Myle Ott
a48f235636 Apply black+isort (#1357)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1357

Reviewed By: alexeib

Differential Revision: D24377772

fbshipit-source-id: 51581af041d42d62166b33a35a1a4228b1a76f0c
2020-10-18 18:14:51 -07:00
Mu Tian
42c5dcbd18 hydra fairseq 3 - inherit from legacy for fairseq classes
Summary: hydra fairseq 3 - inherit from legacy for fairseq classes

Reviewed By: alexeib

Differential Revision: D23375457

fbshipit-source-id: ef9d19f2d02f2326eea44a70f1f6e1668b420840
2020-09-09 17:02:13 -07:00
Marco Gaido
11345a7608 Pass all net_inputs in SequenceGenerator (#2090)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?

## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/2022.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2090

Reviewed By: cndn

Differential Revision: D21385984

Pulled By: myleott

fbshipit-source-id: 1428e02e625b8625df71a83c05dcf933c3f899df
2020-05-10 06:13:06 -07:00
Myle Ott
7a6519f84f Bugfixes (#1159)
Summary:
Several bugfixes to get tests passing on OSS master
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1159

Reviewed By: ngoyal2707

Differential Revision: D21331993

Pulled By: myleott

fbshipit-source-id: 327ae19f6797f92b8c6083a49d5f5edb0872223e
2020-05-01 04:09:37 -07:00
Ning Dong
b1af3e33d5 Modify gated unit tests to fix Fairseq OSS (#2059)
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2059

test_ensemble_sequence_generator and test_export_ensemble_model are green on fbcode master, but the PyTorch 1.5 release cut happened before the TorchScript fix, so update the gate to 1.6.
Remove the quantization test from fairseq, as FBGEMM is bound on the OSS side. Will add the test back in fbtranslate, but land this first to fix OSS-side failures.

Reviewed By: myleott

Differential Revision: D21231873

fbshipit-source-id: 8a2ad7dbed118ca8e3f4c351c399a82fd9740445
2020-04-24 13:29:50 -07:00
Ning Dong
b142b7d9ec Script _no_repeat_ngram in fb_simple_sequence_generator (#1963)
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1963

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1128

It's a common issue that short inputs (< 5 tokens) get repeated due to the default length constraint (max_len_a=1.1, max_len_b=5) https://fb.workplace.com/groups/2286753504877951/permalink/2674177509468880/.

In the future we want to use no_ngram_repeat to handle this issue. The functionality is in the sequence generator, but it needs to be scripted for production use.

Reviewed By: liuchen9494

Differential Revision: D20801865

fbshipit-source-id: c3085f19921adb85415636d16ce31e3826642335
2020-04-10 14:44:42 -07:00
Ning Dong
08691f8d0b Support quantization in Fairseq Sequence generator
Summary: The fix in MHA was suggested by driazati, to avoid JIT compilation of the if-branch in the MHA forward when scripting. Without this, quantization wouldn't work. Details in https://fb.workplace.com/groups/2240361332735959/permalink/626166461295703/

Reviewed By: jhcross

Differential Revision: D20881076

fbshipit-source-id: b50347b45cd7dbdef02ac7b71316ba734019f57e
2020-04-08 17:48:54 -07:00
Chen Liu
d37529ed23 Script reorder_incremental_state in fairseq baseline model (#1127)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1127

Pull Request resolved: https://github.com/pytorch/fairseq/pull/1953

Script the `reorder_incremental_states` in the base FairseqModel
Remove the overriding scriptable `reorder_incremental_states` in the TransformerModel
Change the decoder_len, since len(Tuple) is supported in Script

Relanded reverted diff D20797390

Reviewed By: myleott

Differential Revision: D20896200

fbshipit-source-id: cc4ae34f89f16007656cce6ec6f7e01b13899278
2020-04-07 15:01:31 -07:00
Chen Liu
1b749f4a34 Deprecate the SequenceGenerator with the Scripted version (#1120)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1120

Pull Request resolved: https://github.com/pytorch/fairseq/pull/1940

Deprecate the SequenceGenerator in Fairseq in favor of the Scripted version.

Pass all integration unit tests

- Copy ScriptSequenceGenerator to SequenceGenerator:
  - Modified forward_decoder to fix a bug when using adaptive_softmax in `get_prob_normalize` (marked with an inline comment)
  - Add support for other EnsembleModels as an input arg (marked with an inline comment)
- Add `FBEnsembleModelWithFork` to support fork/join in EnsembleModel
  - Add `test_fb_ensemble_model` to test the fork/join feature
  - There are still bugs in the fork/join feature when running through the fairseq interface (e.g. generation and interactive). Needs further investigation P128130029. cc cndn, jhcross
- Modified the SequenceGenerator initialization interface
- Clean up the code: delete unused functions `get_normalized_probs` and `_decode`

Reland reverted diff D20685075

Reviewed By: cndn

Differential Revision: D20895977

fbshipit-source-id: 424ee318e67d5d6ffed3edb92c7fa78485ba34af
2020-04-07 13:28:30 -07:00
Aapo Kyrola
966436403e Revert D20685075: Deprecate the SequenceGenerator with the Scripted version
Differential Revision:
D20685075

Original commit changeset: 046b76874465

fbshipit-source-id: 7ec2a2ca3b90251a560e2323c22b52ec7436fecb
2020-04-07 00:59:53 -07:00
Aapo Kyrola
8a528888e4 Revert D20797390: Script reorder_incremental_state in fairseq baseline model
Differential Revision:
D20797390

Original commit changeset: ab29874973ad

fbshipit-source-id: efd2d720c96ee90d1e8dc36178e04f0bf5510278
2020-04-07 00:59:48 -07:00
Chen Liu
d369c88019 Script reorder_incremental_state in fairseq baseline model (#1127)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1127

Pull Request resolved: https://github.com/pytorch/fairseq/pull/1953

Script the `reorder_incremental_states` in the base FairseqModel
Remove the overriding scriptable `reorder_incremental_states` in the TransformerModel
Change the decoder_len, since len(Tuple) is supported in Script

Reviewed By: myleott

Differential Revision: D20797390

fbshipit-source-id: ab29874973adc5dbd556c591942a0e071c81fc52
2020-04-06 20:40:40 -07:00
Chen Liu
bc93681348 Deprecate the SequenceGenerator with the Scripted version (#1120)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1120

Pull Request resolved: https://github.com/pytorch/fairseq/pull/1940

Deprecate the SequenceGenerator in Fairseq in favor of the Scripted version.

Pass all integration unit tests

- Copy ScriptSequenceGenerator to SequenceGenerator:
  - Modified forward_decoder to fix a bug when using adaptive_softmax in `get_prob_normalize` (marked with an inline comment)
  - Add support for other EnsembleModels as an input arg (marked with an inline comment)
- Add `FBEnsembleModelWithFork` to support fork/join in EnsembleModel
  - Add `test_fb_ensemble_model` to test the fork/join feature
  - There are still bugs in the fork/join feature when running through the fairseq interface (e.g. generation and interactive). Needs further investigation P128130029. cc cndn, jhcross
- Modified the SequenceGenerator initialization interface
- Clean up the code: delete unused functions `get_normalized_probs` and `_decode`

Reviewed By: myleott

Differential Revision: D20685075

fbshipit-source-id: 046b76874465a70d8118a97ad670311c6ce1d1c8
2020-04-06 17:47:47 -07:00
Marco Gaido
431d604f69 Fix generation with encoders which return an output of a shape different from the input (#1792)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?

## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/1791.

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1792

Reviewed By: jmp84

Differential Revision: D20322704

Pulled By: myleott

fbshipit-source-id: 3cfa1bddda06b966e9dc9bc8ff183009d844b23c
2020-03-10 11:51:08 -07:00
Aleksandra Piktus
fab2e86e51 Add a diverse beam search variant to sequence_generator.py (#953)
Summary:
This PR implements a new generation strategy that we experimented with in project Pinocchio (https://github.com/fairinternal/Pinocchio); see the paper submission at https://fburl.com/hduj2me7.

Specifically in this PR:
- added a Diverse Beam Search variant as described in https://arxiv.org/abs/1611.08562
- moved the Search object generation out of `sequence_generation.py`, which allows for limiting the number of kwargs passed around
- made sure the above changes are backward compatible based on grep - P124083926
- added test cases covering these scenarios
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/953

Test Plan:
- `python -m unittest tests.test_binaries -v`- including added test cases, see issues below for some details
- `python -m unittest tests.test_sequence_generator -v` - including added test cases
- tested locally in conjunction with the Pinocchio repo
- grepped for all instantiations of `SequenceGeneration`, made sure they're backward compatible

# Issues
- when I try to run all tests with the `python -m unittest tests.test_binaries -v` command, the execution gets stuck on `test_binaries.TestTranslation.test_generation` - the test otherwise passes without problems when run individually. Is this a known problem?
- discovered T59235948 - assigned to fairseq oncall

Reviewed By: myleott, fabiopetroni

Differential Revision: D19142394

Pulled By: ola13

fbshipit-source-id: d24543424c14a9537e7b6485951d9f841da62b07
2020-01-06 08:24:02 -08:00
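The diverse beam search variant added above splits the beam into groups and, at each step, penalizes a group for choosing tokens that earlier groups have already chosen. A minimal pure-Python sketch of one such decoding step (greedy within each group, for illustration only; the real generator works on per-beam tensors):

```python
def diverse_beam_step(group_lprobs, strength=0.5):
    """One step of group-wise diverse beam search.

    Each group picks its best token after subtracting `strength` times
    the number of earlier groups that already picked that token this step.
    """
    counts = {}   # token -> how many earlier groups chose it this step
    picks = []
    for lprobs in group_lprobs:
        adjusted = [lp - strength * counts.get(tok, 0)
                    for tok, lp in enumerate(lprobs)]
        best = max(range(len(adjusted)), key=adjusted.__getitem__)
        picks.append(best)
        counts[best] = counts.get(best, 0) + 1
    return picks
```

With two groups sharing the same scores, the penalty pushes the second group onto a different token, which is exactly the diversity effect the commit describes.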
Myle Ott
4abadbdf77 Fix sampling with beam>1
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/792

Differential Revision: D16591987

Pulled By: myleott

fbshipit-source-id: d27c490ae75f80ded19226b8384f4776485dd694
2019-08-01 07:34:06 -07:00
Myle Ott
e75cff5f2c Relicense fairseq under MIT license (#786)
Summary:
The previous BSD+PATENTS license was controversial. We have been
approved to relicense fairseq under the MIT license.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/786

Differential Revision: D16560654

Pulled By: myleott

fbshipit-source-id: f78b1beb4f2895dd7b9bfc79f5f952a2bfb94034
2019-07-30 07:48:23 -07:00
Xing Zhou
e46b924dea Nucleus (top-P) sampling (#710)
Summary:
Implement Nucleus (top-P) sampling: sample among the smallest set of elements whose cumulative probability mass exceeds p.

To test it:
python generate.py   ~myleott/data/data-bin/wmt17_zh_en_full/   --path ~myleott/zh_en/model.pt   --remove-bpe   --nbest 5   --beam 5 --sampling --sampling-topp 0.3
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/710

Test Plan:
python generate.py   ~myleott/data/data-bin/wmt17_zh_en_full/   --path ~myleott/zh_en/model.pt   --remove-bpe   --nbest 5   --beam 5 --sampling --sampling-topp 0.3

python tests/test_sequence_generator.py

python tests/test_binaries.py

Reviewed By: myleott

Differential Revision: D16286688

Pulled By: xingz9

fbshipit-source-id: 1776d21e17c4532a3d24ac75bb7e75da9acad58f
2019-07-17 06:21:33 -07:00
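Nucleus (top-p) sampling as described above can be sketched in a few lines of pure Python: keep the smallest set of top tokens whose cumulative probability reaches p, renormalize within that set, and sample. This is an illustrative sketch, not fairseq's implementation:

```python
import random

def top_p_sample(probs, p, rng=random):
    """Sample among the smallest set of tokens whose cumulative mass >= p."""
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    nucleus, total = [], 0.0
    for i in order:
        nucleus.append(i)
        total += probs[i]
        if total >= p:
            break
    # draw within the nucleus, renormalized by its total mass
    r = rng.random() * total
    acc = 0.0
    for i in nucleus:
        acc += probs[i]
        if r <= acc:
            return i
    return nucleus[-1]
```

For example, with `probs = [0.6, 0.3, 0.1]` and `p = 0.3` the nucleus is just the top token, so the sample is always token 0.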
Myle Ott
b65c579bed Modularize generate.py (#351)
Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/351

This makes it easier for tasks to plugin to generate.py/interactive.py
Pull Request resolved: https://github.com/pytorch/fairseq/pull/520

Differential Revision: D14183881

Pulled By: myleott

fbshipit-source-id: ede5e53ddc1215ed3b12b8f1eba048c946913c33
2019-02-22 10:08:52 -08:00
Myle Ott
864b89d044 Online backtranslation module
Co-authored-by: liezl200 <lie@fb.com>
2018-09-25 17:36:43 -04:00
Stephen Roller
bfeb773214 Pass encoder_input to generator, rather than src_tokens/src_lengths. 2018-09-25 17:36:43 -04:00
Myle Ott
311d2c6ca9 Revert sequence generator changes 2018-09-25 17:36:43 -04:00
Stephen Roller
e6d45d5cd7 Generator: net_input instead of manual src_tokens. 2018-09-25 17:36:43 -04:00
Myle Ott
8c0ca1a0c1 Diverse Beam Search 2018-09-03 19:15:23 -04:00
Myle Ott
6edf81ddfe
Remove more Variable() calls (#198) 2018-06-25 12:23:04 -04:00
Myle Ott
ff68a9ef50 Add FairseqTask
A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.

Changes:
- Add TranslationTask and LanguageModelingTask. New tasks can be registered with @register_task decorator.
- Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
- Remove LEFT_PAD_* constants and make them configurable per task
2018-06-15 13:05:22 -06:00
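The `@register_task` decorator mentioned above follows the standard registry pattern: a decorator maps a string name to a class so tasks can be looked up by command-line name. A minimal sketch of how such a registry works (simplified relative to fairseq's actual implementation):

```python
TASK_REGISTRY = {}

def register_task(name):
    """Minimal registry decorator in the spirit of fairseq's @register_task."""
    def wrapper(cls):
        if name in TASK_REGISTRY:
            raise ValueError("Cannot register duplicate task: " + name)
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper

@register_task("translation")
class TranslationTask:
    pass

# tasks can now be looked up by name, e.g. when parsing a --task argument
task_cls = TASK_REGISTRY["translation"]
```

The same pattern underlies `register_model`, `register_criterion`, and `register_optimizer` mentioned in the distributed-training commit below.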
Myle Ott
d3795d6cd1
Merge internal changes (#136)
Changes:
- 7d19e36: Add `--sampling` flag to generate.py to sample instead of doing beam search
- c777340: Add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model
- 3ea882c: Add `--max-update` option to train.py to stop training after a given number of updates
- small bugfixes for distributed training, LSTM, inverse square root LR scheduler
2018-04-02 10:13:07 -04:00
Myle Ott
6641520612
fairseq-py goes distributed (#106)
This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.

Changes:
- c7033ef: add support for distributed training! See updated README for usage.
- e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc.
- 154e440: update LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf
- 90c2973 and 1da6265: improve unit test coverage
2018-02-27 17:09:42 -05:00