Summary:
Incorporate several fixes, including some from OSS contributors:
- Fix model argument in sequence generator in semisupervised_translation.py
- Fix aggregate logging in semisupervised_translation.py
- Fix EOS token in multilingual_denoising
- Handle missing eos_idx in data_utils.collate_tokens
- Better OOM handling for single-GPU training
- Fix prepend_bos argument in translation_from_pretrained_bart.py …
- Fix eos_idx in multilingual_denoising
- Small logging fixes
- Fix fb_hub on PyTorch 1.6
- Better variable names
- Add support for model parallel to interactive.py
- Use the `//` operator to fix the integer division warning (see the sketch after this list)
- Set default `--clip-norm=0.0`
- Clean up some binaries in the root directory
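To illustrate the `//` change above, a minimal sketch (the variable names are illustrative, not the actual fairseq code):

```python
import torch

lengths = torch.tensor([7, 9])

# Dividing an integer tensor with `/` emits an integer-division
# deprecation warning on newer PyTorch; `//` makes floor division explicit.
half = lengths // 2            # tensor([3, 4])

# The same substitution applies to plain Python ints used for sizing.
num_chunks = int(lengths.sum()) // 4
```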
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1196
Reviewed By: ngoyal2707
Differential Revision: D22162202
Pulled By: myleott
fbshipit-source-id: 835b0c0ad9246827f9d915fdb4e89d7b5be2475d
Summary:
# Description
In [examples/translation](https://github.com/pytorch/fairseq/tree/master/examples/translation), the code will not run if you change the model from `transformer.wmt16` to `transformer.wmt19`, since the two models use different BPE schemes. I corrected that and added a note at the end of the section.
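For illustration, a hedged sketch of the difference, following the torch.hub examples in that README (the exact checkpoint filename is an assumption):

```python
import torch

# WMT'16 models were trained with subword-nmt BPE ...
en2de_wmt16 = torch.hub.load(
    'pytorch/fairseq', 'transformer.wmt16.en-de',
    tokenizer='moses', bpe='subword_nmt')

# ... while the WMT'19 models use fastBPE, so the `bpe` argument must change too.
en2de_wmt19 = torch.hub.load(
    'pytorch/fairseq', 'transformer.wmt19.en-de',
    checkpoint_file='model1.pt',
    tokenizer='moses', bpe='fastbpe')

print(en2de_wmt19.translate('Hello world!'))
```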
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1951
Reviewed By: ngoyal2707
Differential Revision: D21663490
Pulled By: myleott
fbshipit-source-id: 13010dbec0ef5202355e0b3eb6d77b1958e80e97
Summary:
# Before submitting
- [X] Was this discussed/approved via a GitHub issue? (not needed for typos or doc improvements)
- [X] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [X] Did you make sure to update the docs?
- [X] Did you write any new necessary tests?
## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/1777.
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1784
Differential Revision: D20322705
Pulled By: myleott
fbshipit-source-id: 0787225db7f94da0565a2aa7628f2a1ee22f777f
Summary:
Very minor fix to avoid overwriting validation data.
# Before submitting
- [x] Was this discussed/approved via a GitHub issue? (not needed for typos or doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?
## What does this PR do?
Fixes https://github.com/pytorch/fairseq/issues/1641.
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1642
Differential Revision: D19555371
Pulled By: myleott
fbshipit-source-id: 2c2dd1d3c66605dd42113f2330ba98fe62c53a92
Summary:
In https://github.com/pytorch/fairseq/issues/656, people are often confused about how to set multilingual translation parameters at inference time.
This diff adds more checks to ensure that the arguments (`--lang-pairs`, `--encoder-langtok`, `--decoder-langtok`) loaded from the checkpoint are consistent with the arguments specified on the generate/interactive command line.
We also add a section to the examples page explaining how to set these arguments.
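As a hedged illustration (the data path, checkpoint path, and language pairs are hypothetical), generation with a multilingual checkpoint has to repeat the same `--lang-pairs` and langtok settings used at training time:

```python
import subprocess

# Hypothetical paths; the important part is that --lang-pairs and
# --encoder-langtok/--decoder-langtok match what the checkpoint was trained with.
subprocess.run([
    "fairseq-generate", "data-bin/iwslt17.de_fr.en.bpe16k",
    "--task", "multilingual_translation",
    "--lang-pairs", "de-en,fr-en",       # same as at training time
    "--encoder-langtok", "src",          # same as at training time
    "--source-lang", "fr", "--target-lang", "en",
    "--path", "checkpoints/multilingual_transformer/checkpoint_best.pt",
    "--batch-size", "128", "--beam", "5",
    "--remove-bpe", "sentencepiece",
], check=True)
```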
Reviewed By: myleott
Differential Revision: D15682169
fbshipit-source-id: 64e6db94cd72ea7ce2d0aa1067c9c2dcd3b8a2ac
Summary:
* Add example for multilingual translation on IWSLT'17 (see the training sketch after this list)
* Match dataset ordering for multilingual_translation and translation
* Fix bug with LegacyDistributedDataParallel when calling forward of sub-modules
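A condensed, hedged sketch of the kind of training command the new IWSLT'17 example describes (the data path, architecture, and hyperparameters here are illustrative, not the exact README recipe):

```python
import subprocess

# Train one multilingual transformer on de-en and fr-en with a shared decoder.
subprocess.run([
    "fairseq-train", "data-bin/iwslt17.de_fr.en.bpe16k",
    "--task", "multilingual_translation",
    "--lang-pairs", "de-en,fr-en",
    "--arch", "multilingual_transformer_iwslt_de_en",
    "--share-decoders", "--share-decoder-input-output-embed",
    "--optimizer", "adam", "--lr", "0.0005",
    "--lr-scheduler", "inverse_sqrt", "--warmup-updates", "4000",
    "--criterion", "label_smoothed_cross_entropy", "--label-smoothing", "0.1",
    "--max-tokens", "4000",
    "--save-dir", "checkpoints/multilingual_transformer",
], check=True)
```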
Pull Request resolved: https://github.com/pytorch/fairseq/pull/527
Differential Revision: D14218372
Pulled By: myleott
fbshipit-source-id: 2e3fe24aa39476bcc5c9af68ef9a40192db34a3b
Summary:
Code for the paper: [Mixture Models for Diverse Machine Translation: Tricks of the Trade (Shen et al., 2019)](https://arxiv.org/abs/1902.07816).
Pull Request resolved: https://github.com/pytorch/fairseq/pull/521
Differential Revision: D14188021
Pulled By: myleott
fbshipit-source-id: ed5b1ed5ad9a582359bd5215fa2ea26dc76c673e
Summary:
Changelog:
- `90f52a1`: Support loading subsets of the data on each worker with the `--fix-batches-to-gpus` flag (see the sketch after this list). This should fix #217 and #266.
- `6eda0a9`: Update README for replicating the "Scaling Neural Machine Translation" paper
- `b14c7cf`: Fall back to the no_c10d backend for PyTorch 0.4.1 (fixes #294)
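For context, a hedged sketch of how the new flag is passed; every argument besides `--fix-batches-to-gpus` is a placeholder, and the modern `fairseq-train` entry point is used here for brevity:

```python
import subprocess

# --fix-batches-to-gpus keeps the same batches on the same GPU across epochs,
# which lets each worker load only its own subset of the data (per the
# changelog item above). Data path, arch, and sizes are placeholders.
subprocess.run([
    "fairseq-train", "data-bin/wmt16_en_de",
    "--arch", "transformer_wmt_en_de",
    "--fix-batches-to-gpus",
    "--max-tokens", "3584",
], check=True)
```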
Pull Request resolved: https://github.com/pytorch/fairseq/pull/295
Differential Revision: D10121559
Pulled By: myleott
fbshipit-source-id: 41c84d0ee4cdd113544b5d3aa38ae8b23acc2c27
A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.
Changes:
- Add TranslationTask and LanguageModelingTask. New tasks can be registered with the `@register_task` decorator (see the sketch after this list).
- Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
- Remove LEFT_PAD_* constants and make them configurable per task
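A minimal, hedged sketch of the registration mechanism; the task name and body are illustrative, and a real task would also build dictionaries, load datasets, and expose them via the usual properties:

```python
from fairseq.tasks import FairseqTask, register_task

@register_task('toy_translation')  # illustrative task name
class ToyTranslationTask(FairseqTask):

    @staticmethod
    def add_args(parser):
        # Task-specific command-line arguments.
        parser.add_argument('data', help='path to data directory')

    @classmethod
    def setup_task(cls, args, **kwargs):
        # Build dictionaries / shared state here, then construct the task.
        return cls(args)

    def load_dataset(self, split, **kwargs):
        # Load `split` (e.g., 'train', 'valid') into self.datasets[split].
        raise NotImplementedError
```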