Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?
## What does this PR do?
Include wav2vec-U 2.0.
!!! TODO !!! update title/link of paper in readme
X-link: https://github.com/fairinternal/fairseq-py/pull/2826
Reviewed By: michaelauli
Differential Revision: D37162174
Pulled By: alexeib
fbshipit-source-id: b985ebb9bb94c25d30b6fc53d8c79088cb9798f9
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/facebookresearch/fairseq/pull/4473
Reviewed By: cbalioglu
Differential Revision: D37052250
Pulled By: dianaml0
fbshipit-source-id: e5e4c38a9108c769953ef2202c7adb8aa335771a
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?
## What does this PR do?
This PR is a cleaned up version of https://github.com/fairinternal/fairseq-py/issues/2138. It is based on the `main` branch instead of the `gshard` branch. Removed call to xFormers MultiHeadDispatch, only using xFormers Attention.
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
X-link: https://github.com/fairinternal/fairseq-py/pull/2263
Reviewed By: blefaudeux
Differential Revision: D33800377
Pulled By: dianaml0
fbshipit-source-id: 658d52214c782212b12881b30c4d908a763b4cf2
Summary:
Our mission at Meta Open Source is to empower communities through open source, and we believe that it means building a welcoming and safe environment for all. As a part of this work, we are adding this banner in support for Ukraine during this crisis.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/4249
Reviewed By: arbabu123
Differential Revision: D34635479
Pulled By: dmitryvinn-fb
fbshipit-source-id: 488d30f0967ae9542ead968c5cb951ecf0e02a64
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Releasing code, model & recipe for the work "Direct speech-to-speech translation with discrete units".
Main changes:
1. examples/speech_to_speech
2. tasks/speech_to_speech
3. data/audio/speech_to_speech_dataset
4. models/speech_to_speech
5. criterions/speech_to_speech_criterion
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2756
Reviewed By: sravyapopuri388, kahne
Differential Revision: D32923969
Pulled By: an918tw
fbshipit-source-id: 838ba42457f4684e9767d15b5b514681a9572b39
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2658
Reviewed By: ngoyal2707
Differential Revision: D32520446
Pulled By: arbabu123
fbshipit-source-id: a4cbc12624c9c8c1b5bc3d64eb47c2fdec01eb87
Summary:
# Before submitting
- [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes the argument for `lr_scheduler.total_num_update`, a missing import of `dsprocessor` for COIN, and `vmasks` on demo inference; updates fairseq's README.md for examples/MMPT.
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2428
Reviewed By: berniebear
Differential Revision: D31528947
Pulled By: howardhsu
fbshipit-source-id: 1fecf34bdab82cbf6001e3905a532e4e6eb38e01
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
zero-shot model release
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2407
Reviewed By: alexeib
Differential Revision: D31417241
Pulled By: xuqiantong
fbshipit-source-id: 576644694638d3b2606f1751b74feb0531b50eb7
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2297
Reviewed By: alexeib
Differential Revision: D30906090
Pulled By: dianaml0
fbshipit-source-id: 941d30db7f766c9077a1b5bb2a04680f57e2e070
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/3879
Reviewed By: myleott
Differential Revision: D30969142
Pulled By: dianaml0
fbshipit-source-id: 902154c03fd68ae6645d3e0ac07b7d729dfc7934
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Release the code for the paper "Discriminative Reranking for Neural Machine Translation"
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2044
Reviewed By: michaelauli
Differential Revision: D29628590
Pulled By: an918tw
fbshipit-source-id: 7a52602d495b736573187cc721829aa545d24770
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1944
Reviewed By: jingfeidu
Differential Revision: D28944206
fbshipit-source-id: 583837f7dd387341574d27dd9acc145455d640a8
Summary:
## What does this PR do?
1. We add an enwik8 character-level LM task sweep for Transformer-XL, which lands at 1.05 bpc, matching the reported performance (1.06): https://github.com/kimiyoung/transformer-xl/tree/master/pytorch
Evaluate with:
```
PYTHONPATH=. python fairseq_cli/eval_lm.py /private/home/daju/data/enwik8/eos-data-bin/ --path /checkpoint/daju/2020-11-19/enwiki8.transformer_xl.fp16.transformer_xl.adam.cl0.25.cosine.lr0.00025.s2.ngpu4/checkpoint_best.pt --user-dir examples/truncated_bptt/ --task truncated_bptt_lm --batch-size 8 --tokens-per-sample 80 --model-overrides '{"mem_len":2100,"clamp_len":820,"same_length":True}'
```
2. Implements adaptive span in fairseq. It reproduces the enwik8 result at 1.03 bpc, compared to the 1.02 reported (for the 12-layer model) in https://github.com/facebookresearch/adaptive-span, which is a consistent improvement over the Transformer-XL baseline listed above with a smaller model.
You can evaluate the example run with:
```
PYTHONPATH=. python fairseq_cli/eval_lm.py /private/home/daju/data/enwik8/eos-data-bin/ --path /checkpoint/daju/2020-11-20/enwiki8.adaptivespan.headwise.adaptive_span.adagrad_with_grad_clip.ag_cl0.03.fixed.wu32000.lr0.07.s2.loss5e-07.ngpu4/checkpoint_best.pt --user-dir examples/truncated_bptt/ --task truncated_bptt_lm --batch-size 8 --tokens-per-sample 512 --gen-subset test
```
Paper: https://arxiv.org/abs/1905.07799
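For reference, the soft masking function from the adaptive-span paper can be sketched as follows (a toy scalar version; `ramp` stands for the paper's hyperparameter R, and all names here are illustrative, not fairseq identifiers):

```python
def adaptive_span_mask(distance: float, z: float, ramp: float) -> float:
    # m_z(d) = clamp((ramp + z - d) / ramp, 0, 1):
    # positions within the learned span z are fully attended,
    # the mask decays linearly over `ramp` positions, and is
    # zero beyond z + ramp.
    return min(max((ramp + z - distance) / ramp, 0.0), 1.0)
```

Because z is learned per attention head, each head can shrink its span independently, which is where the memory and compute savings come from.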
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1428
Reviewed By: myleott
Differential Revision: D25495754
Pulled By: dexterju
fbshipit-source-id: 15a875a5f82d506a4964dea934a374132ce39f8b
Summary:
# Before submitting
- There is no related issue for this pull request.
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- We did not see any necessity for tests.
## What does this PR do?
Add German RoBERTa model (GottBERT)
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2992
Reviewed By: alexeib
Differential Revision: D25494927
Pulled By: myleott
fbshipit-source-id: b6790124d7c3c8dc387c141706cd8a527cc950ab
Summary:
- Set default value of clip-norm back to 0.0 (disabled)
- Add comment explaining that we divide the loss by log(2) to convert the base
- Fix `--zero-optimizer=os` (fixes #2811)
- Update requirements to PyTorch >= 1.5
- Fix bug in fixed LR schedule
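The log(2) division mentioned above is just a change of logarithm base, from nats (natural log) to bits; a minimal illustration, not fairseq's actual code:

```python
import math

def nats_to_bits(loss_nats: float) -> float:
    # Cross-entropy computed with the natural log is measured in nats;
    # dividing by ln(2) converts it to bits (log base 2), the unit
    # conventionally reported for perplexity-style LM metrics like bpc.
    return loss_nats / math.log(2)

loss_nats = math.log(8)          # e.g. uniform prediction over 8 classes
loss_bits = nats_to_bits(loss_nats)  # 3 bits
```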
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1392
Reviewed By: alexeib
Differential Revision: D24714231
Pulled By: myleott
fbshipit-source-id: 63dc8cfc74683bbccbf05b44228014eb12ddbfc7
Summary:
## What does this PR do?
Implements R3F and R4F from the Facebook Research paper: https://arxiv.org/abs/2008.03156
This code was used to generate all results in the paper except the probing results.
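As a sketch of the R3F objective (illustrative, not the fairseq criterion code): the loss adds a symmetric KL term between the model's output distributions on clean and noise-perturbed embeddings, i.e. roughly `task_loss + lambda * sym_KL(f(x), f(x + noise))`:

```python
import math

def kl(p, q):
    # KL divergence between two discrete distributions (assumes q > 0).
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q):
    # The R3F regularizer is symmetric: KL(p||q) + KL(q||p), computed
    # between predictions on clean vs. noise-perturbed input embeddings.
    return kl(p, q) + kl(q, p)
```

In the paper the noise is sampled from a uniform or normal distribution and added to the input token embeddings only, leaving the rest of the forward pass unchanged.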
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2455
Reviewed By: myleott
Differential Revision: D23444863
Pulled By: AkshatSh
fbshipit-source-id: b724a6d6cc9cebfdb4bd219828afbb5679f2259b
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Opensource code for Deep Transformer with Latent Depth (https://arxiv.org/pdf/2009.13102.pdf).
New features and design choices made:
- New feature: allow the non-residual block to be weighted by a sample z (generated per batch) instead of `x = residual + x`.
- Design choice: move `x = residual + x` in transformer_layer.py into a function which the subclass (with latent depth) can override as `x = residual + z*x`.
- New feature: allow TransformerEncoder or TransformerDecoder to have additional logits parameters which will generate the samples z.
- Design choice: added subclasses LatentTransformerEncoder and LatentTransformerDecoder, which have additional attributes for the logits parameters and instantiate the corresponding LatentTransformerEncoderLayer and LatentTransformerDecoderLayer.
- New feature: allow multilingual_translation task to train with latent depth (results in the paper).
- Design choice:
- added additional arguments in the multilingual_translation task.
- added option for multilingual_transformer to use LatentTransformerEncoder and LatentTransformerDecoder besides standard TransformerEncoder.
- added option in multilingual_translation task's `train_step` to generate the samples z and compute the KL (and sparsity) loss per batch.
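The weighted-residual hook described above can be sketched with a toy scalar version (illustrative only; the actual classes operate on tensors and sample z per batch):

```python
class TransformerLayerSketch:
    """Standard residual connection: x = residual + x."""
    def residual_connection(self, x, residual):
        return residual + x

class LatentLayerSketch(TransformerLayerSketch):
    """Latent-depth variant: the block output is scaled by a sampled z,
    so z = 0 reduces the layer to the identity (the layer is skipped)."""
    def __init__(self, z):
        self.z = z

    def residual_connection(self, x, residual):
        return residual + self.z * x
```

Because only `residual_connection` is overridden, the subclass reuses the whole forward pass of the base layer, which is the point of the design choice above.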
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2703
Reviewed By: myleott
Differential Revision: D24155059
Pulled By: xianxl
fbshipit-source-id: f3e41639429f9664ec5565839709aa857a643668
Summary:
# Before submitting
- [N] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [Y] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [Y] Did you make sure to update the docs?
- [N/A] Did you write any new necessary tests?
## What does this PR do?
Add code to reproduce results from Cross-lingual Retrieval for Iterative Self-supervised Training.
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1344
Test Plan:
Imported from GitHub, without a `Test Plan:` line.
See https://github.com/fairinternal/fairseq-py/tree/criss_pr/examples/criss
Reviewed By: myleott
Differential Revision: D24268469
Pulled By: chtran
fbshipit-source-id: d4dd36b22bde3c364ce6e935bd39baf8f96e0735
Summary:
# Before submitting
- [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?
## What does this PR do?
This PR implements constrained decoding ([Hokamp & Liu, 2017](https://www.aclweb.org/anthology/P17-1141/); [Post & Vilar, 2018](https://www.aclweb.org/anthology/N18-1119/)) with vectorization for batching ([Hu et al., 2019](https://www.aclweb.org/anthology/N19-1090/)). In addition, it adds *ordered constraints*, where the constraints are generated on the target side in order, with zero or more unconstrained tokens in between. This variant allows for optimizations that increase speed and BLEU scores (when testing with random scraps from the references).
### Usage and quick start
It works with `fairseq-interactive` via a new command-line option: `fairseq-interactive --constraints [ordered,unordered]`, defaulting to `ordered` if nothing is provided. When active, it will split lines from STDIN on `\t`, with separate constraints each separated by a tab. For example (after downloading the [Fairseq WMT19 German--English model](https://github.com/pytorch/fairseq/blob/master/examples/wmt19/README.md)):
```bash
echo -e "Die maschinelle Übersetzung ist schwer zu kontrollieren.\thard\tinfluence" \
| [normalize.py](https://gist.github.com/mjpost/4c54446b7030d7c64b57461d27090650) \
| [tok.py](https://gist.github.com/mjpost/ed7456f6a987c533102fc121678ed302) \
| PYTHONPATH=$HOME/code/fairseq-constraints fairseq-interactive $modeldir \
--bpe fastbpe \
--bpe-codes $modeldir/bpecodes \
--constraints \
--constraints-both \
-s de -t en \
--path $modeldir/model1.pt \
--max-tokens 1000 \
--beam 5
```
Adding the `--constraints-both` option causes it to batch-decode the input sentence both with and without the constraints. When run with the Fairseq WMT19 German--English model, the following results are produced (run here on a CPU, so don't be alarmed by the times):
```text
S-0 Die masch@@ in@@ elle Über@@ setzung ist schwer zu kontrollieren .
W-0 1.844 seconds
C-0 hard
C-0 influence
H-0 -1.5333266258239746 Mach@@ ine trans@@ lation is hard to influence .
D-0 -1.5333266258239746 Machine translation is hard to influence .
P-0 -0.5434 -0.1423 -0.1930 -0.1415 -0.2346 -1.8031 -0.1701 -11.7727 -0.1815 -0.1511
S-0 Die masch@@ in@@ elle Über@@ setzung ist schwer zu kontrollieren .
W-0 1.844 seconds
H-0 -0.3731671869754791 Mach@@ ine trans@@ lation is difficult to control .
D-0 -0.3731671869754791 Machine translation is difficult to control .
P-0 -0.5434 -0.1423 -0.1930 -0.1415 -0.2346 -1.1430 -0.1665 -0.8482 -0.1678 -0.1514
2020-07-31 12:17:55 | INFO | fairseq_cli.interactive | Total time: 12.803 seconds; translation time: 3.688
```
Note the new tags present in the output:
* `C-#` records active constraints (after applying preprocessing) for a sentence
* `W-#` reports the sentence-level translation time (a useful unrelated feature I hope you'll accept)
Some unit tests are written (`fairseq/test_constraints.py`) but not yet integrated. Advice here on where to place this is welcome. I also have not run this through lint; if someone can tell me the command to run, I'd appreciate it.
### Implementation notes
This is largely self-contained, implemented in a new `LexicallyConstrainedBeamSearch` class in `search.py`. It does require a few minimal hooks from `_generate()` in `sequence_generator.py`, to ensure that constraints are updated at each timestep. (Edit: most changes in that file are documentation clarifications, corrections, and updates). Unconstrained sentences that are intermingled with constrained ones will not incur any time penalty, so long as they do not occur in the same batch.
Addresses https://github.com/pytorch/fairseq/issues/1536.
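The ordered variant can be illustrated with a toy state machine (illustrative only; the actual `LexicallyConstrainedBeamSearch` vectorizes this bookkeeping across the whole beam):

```python
def advance(constraint_tokens, state, generated_token):
    # Ordered constraints: each hypothesis tracks an index into the flat
    # constraint token sequence and advances only when the generated token
    # matches the next unmet constraint token. Out-of-order matches do not
    # count, which is what distinguishes "ordered" from "unordered".
    if state < len(constraint_tokens) and generated_token == constraint_tokens[state]:
        return state + 1
    return state

def satisfied(constraint_tokens, state):
    # A hypothesis may only finish once every constraint has been met.
    return state == len(constraint_tokens)
```

The beam search then allocates beam slots by constraint-progress "bank", so hypotheses that have met more constraints are not crowded out by unconstrained ones.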
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2402
Reviewed By: alexeib
Differential Revision: D23188945
Pulled By: myleott
fbshipit-source-id: 9f5ed855f7a1dcf535b091c0ccf98b07fb9cbdd6
Summary:
Incorporate several fixes, incl. from OSS contributors:
- fix model argument in sequence generator in semisupervised_translation.py
- fix aggregate logging in semisupervised_translation.py
- Fix EOS token in multilingual_denoising
- Handle missing eos_idx in data_utils.collate_tokens
- Better OOM handling for single-GPU training
- fix prepend_bos argument in translation_from_pretrained_bart.py …
- Fix eos_idx in multilingual_denoising
- Small logging fixes
- Fix fb_hub on PyTorch 1.6
- Better variable names
- Add support for model parallel to interactive.py
- Use `//` operator to fix Integer division warning
- Set default `--clip-norm=0.0`
- Cleanup some binaries in root directory
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1196
Reviewed By: ngoyal2707
Differential Revision: D22162202
Pulled By: myleott
fbshipit-source-id: 835b0c0ad9246827f9d915fdb4e89d7b5be2475d
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Add code for published paper from FB
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
*Still WIP*
jmp84
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1707
Reviewed By: jmp84
Differential Revision: D21304498
Pulled By: xutaima
fbshipit-source-id: 073d522e0eeef3e02c83e4617b8e5b697ff6979b
Summary:
FUNCTIONALITY:
This diff provides two core pieces of functionality
- Adds training with quantization noise from "Training with Quantization Noise for Extreme Model Compression", controlled by the `quant_noise` and `quant_noise_block_size` parameters. Added in embeddings, attention, and FFN for BERT and Transformer LM training.
- Adds product quantization based on code from "And the Bit Goes Down: Revisiting the Quantization of Neural Networks" (Stock et al., 2019). This is applied to a fairseq-trained model to quantize it after training.
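As a rough illustration of the quantization-noise idea (a scalar toy under simplifying assumptions, not this diff's block/product-quantization code): during training, each weight is replaced by its quantized value with probability p, while the rest pass through at full precision, so the network learns to be robust to the quantization it will see at inference:

```python
import random

def quant_noise(weights, p, levels=256):
    """Toy sketch: with probability p, a weight sees a uniform
    `levels`-level quantization round-trip; otherwise it is untouched.
    p=1 corresponds to plain post-training quantization; during training
    p is a small fraction (the "noise")."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (levels - 1) or 1.0  # avoid div-by-zero if constant
    return [
        lo + round((w - lo) / scale) * scale if random.random() < p else w
        for w in weights
    ]
```

The real implementation operates on contiguous blocks of weights (hence `quant_noise_block_size`) and, for product quantization, on learned codebooks rather than uniform levels.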
TODO:
-> Pierre, look at quantization code
-> int4 and int8 quantization will be added soon.
EVALUATED TEST CASES:
0. Training of LM and BERT models starts from scratch with no errors -> yes
1. Retrain LM from scratch with code, no quantization, reproduces Wikitext-103 LM results -> yes, see /checkpoint/angelafan/qn_open_source_noise
2. Reload previously trained LM from scratch, not trained with quant noise, reproduces Wikitext-103 LM results -> yes
3. Train LM from scratch with code, not trained with quant noise, reproduces Wikitext-103 LM results -> yes, see /checkpoint/angelafan/qn_open_source_baseline
4. Train BERT model from scratch with code, no quantization, training curve looks the same as before -> yes
5. Check wps during training and wps during inference, no large change from before -> yes
6. Check structured dropout isn't being applied at eval time -> yes
7. Works in combination with LayerDrop -> yes
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1896
Reviewed By: myleott
Differential Revision: D20609420
Pulled By: huihuifan
fbshipit-source-id: 94468dd811c4caaaef46a9fab2b8d381f9d2b955
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1124
Reviewed By: myleott
Differential Revision: D20749898
fbshipit-source-id: 42bca96d8d65158ae858ceaa7386afedf1696ebb
Summary:
Implemented byte-level BPE described in ["Neural Machine Translation with Byte-Level Subwords"](https://arxiv.org/abs/1909.03341)
* Added bytes/characters/byte-level BPE tokenizers to fairseq.data.encoder
* Added detokenization option to generate.py
* Added an example under examples/byte_level_bpe
* Implemented Transformer model with Bi-GRU embedding contextualization: `examples/byte_level_bpe/gru_transformer.py`
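The byte-level representation underlying this can be sketched in a few lines (illustrative; the actual tokenizers added here live in fairseq's data encoders):

```python
def to_byte_tokens(text: str) -> list:
    # Byte-level modeling: represent text as its UTF-8 byte sequence.
    # This gives a closed base vocabulary of at most 256 symbols with no
    # out-of-vocabulary characters; BPE merges are then learned on top
    # of these byte tokens instead of on characters.
    return list(text.encode("utf-8"))

def from_byte_tokens(tokens) -> str:
    # Detokenization is a lossless inverse, assuming the byte sequence
    # forms valid UTF-8 (generation may need to repair invalid sequences).
    return bytes(tokens).decode("utf-8")
```

Non-ASCII characters expand to multiple byte tokens, which is why the paper pairs the byte vocabulary with BPE merges to keep sequences short.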
Reviewed By: myleott
Differential Revision: D20600963
fbshipit-source-id: 3eca4d046056c07f65333123416017a4eac04c8a
Summary:
Hi,
This PR updates the link to the mBART documentation in the main README.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1789
Differential Revision: D20322673
Pulled By: myleott
fbshipit-source-id: b59c94f49176ba5bbd664791818b5b8ce7402698
Summary:
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1033
Differential Revision: D20122520
Pulled By: yinhanliu
fbshipit-source-id: e2fd93e2fa9b7a8e276acc4316a176ba3ceae4ed
Summary:
Recent releases of apex removed the `fused_adam_cuda` function used in 3f4fc50163/fairseq/optim/adam.py (L220). Users need to use the `--deprecated_fused_adam` option to install `fused_adam_cuda`.
# Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
## What does this PR do?
Fixes # (issue).
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
## Did you have fun?
Make sure you had fun coding!
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1563
Differential Revision: D19260517
Pulled By: myleott
fbshipit-source-id: 69af015f3ef1fa85b98d138c28876ada194c9437
Summary:
Check locally that everything works fine.
Model is uploaded to fbaipublicfiles.
I fixed a few inconsistencies in the BPE encoding along the way, e.g. related to https://github.com/pytorch/fairseq/issues/1306.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/904
Reviewed By: ngoyal2707
Differential Revision: D18418345
Pulled By: louismartin
fbshipit-source-id: 53acb4d021581968d70430ee9babee07d6573c17