Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Introduction

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. It provides reference implementations of various sequence-to-sequence models, including the convolutional and Transformer models listed under Pre-trained Models below.

Fairseq features:

  • multi-GPU (distributed) training on one machine or across multiple machines
  • fast generation on both CPU and GPU with multiple search algorithms implemented, including beam search
  • large mini-batch training even on a single GPU via delayed updates
  • fast half-precision floating point (FP16) training
  • extensible: easily register new models, criterions, tasks, optimizers and learning rate schedulers
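
The delayed-updates feature in the list above can be pictured as gradient accumulation: gradients from several small batches are summed, and the parameters are updated once at the end, which mimics a single step on a larger batch. The following is a minimal sketch in plain Python, not fairseq code; the one-parameter model, the data, and the function names are all made up for illustration.

```python
# Illustrative sketch of delayed updates (gradient accumulation).
# The model (y = w * x), the data, and the learning rate are hypothetical.

def grad_sum(w, batch):
    """Sum over the batch of d/dw (w * x - y)^2."""
    return sum(2.0 * (w * x - y) * x for x, y in batch)

def step_large_batch(w, data, lr):
    """One SGD step on the mean loss over the whole batch at once."""
    return w - lr * grad_sum(w, data) / len(data)

def step_delayed(w, data, lr, chunk_size):
    """Accumulate gradients chunk by chunk, then apply one delayed update."""
    accumulated = 0.0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]  # small enough to process alone
        accumulated += grad_sum(w, chunk)       # no parameter update yet
    return w - lr * accumulated / len(data)     # single update at the end

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w_large = step_large_batch(0.5, data, lr=0.01)
w_delayed = step_delayed(0.5, data, lr=0.01, chunk_size=2)
print(abs(w_large - w_delayed) < 1e-9)
```

Both steps should land on the same parameter value up to floating-point rounding, which is why accumulating over small chunks can stand in for a batch that would not fit in memory at once.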

We also provide pre-trained models for several benchmark translation and language modeling datasets.

Requirements and Installation

Currently fairseq requires PyTorch version >= 1.0.0. Please follow the instructions here: https://github.com/pytorch/pytorch#installation.
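
One way to confirm that an installed PyTorch satisfies this requirement is to compare the numeric part of its version string against 1.0.0. The snippet below is a rough sketch under that assumption; `parse_version` and `meets_requirement` are hypothetical helper names, and pre-release suffixes are deliberately ignored, so a development build of 1.0.0 would also pass.

```python
import re

MIN_PYTORCH = (1, 0, 0)  # minimum version stated in this README

def parse_version(version):
    """Return the leading numeric components of a version string as a tuple.

    Suffixes such as '+cu90', '.post2' or 'a0' are ignored (an assumption
    made for simplicity, not a full version-spec parser).
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return tuple(int(part or 0) for part in match.groups())

def meets_requirement(version, minimum=MIN_PYTORCH):
    """True if the numeric version is at least `minimum`."""
    return parse_version(version) >= minimum

if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print(torch.__version__, meets_requirement(torch.__version__))
```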

If you use Docker, make sure to increase the shared memory size with either --ipc=host or --shm-size as command line options to nvidia-docker run.

After PyTorch is installed, you can install fairseq with:

pip install -r requirements.txt
python setup.py build develop

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.
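
As a rough picture of what registering a new model type can look like, the sketch below shows a generic decorator-based registry. It is not fairseq's implementation or API; `COMPONENT_REGISTRY`, `register_component`, `build_component`, and `ToyModel` are hypothetical names chosen for this example. The full documentation describes the actual extension points.

```python
# Generic decorator-based registry; every name here is hypothetical.
COMPONENT_REGISTRY = {}

def register_component(name):
    """Return a class decorator that stores the class under `name`."""
    def decorator(cls):
        COMPONENT_REGISTRY[name] = cls
        return cls
    return decorator

def build_component(name, **kwargs):
    """Look up a registered class by name and instantiate it."""
    return COMPONENT_REGISTRY[name](**kwargs)

@register_component("toy_model")
class ToyModel:
    def __init__(self, hidden_size=8):
        self.hidden_size = hidden_size

model = build_component("toy_model", hidden_size=16)
print(type(model).__name__, model.hidden_size)
```

The appeal of this pattern is that a new component becomes selectable by name without editing the code that looks it up.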

Pre-trained Models

We provide the following pre-trained models and pre-processed, binarized test sets:

Translation

| Description | Dataset | Model | Test set(s) |
|---|---|---|---|
| Convolutional (Gehring et al., 2017) | WMT14 English-French | download (.tar.bz2) | newstest2014: download (.tar.bz2); newstest2012/2013: download (.tar.bz2) |
| Convolutional (Gehring et al., 2017) | WMT14 English-German | download (.tar.bz2) | newstest2014: download (.tar.bz2) |
| Convolutional (Gehring et al., 2017) | WMT17 English-German | download (.tar.bz2) | newstest2014: download (.tar.bz2) |
| Transformer (Ott et al., 2018) | WMT14 English-French | download (.tar.bz2) | newstest2014 (shared vocab): download (.tar.bz2) |
| Transformer (Ott et al., 2018) | WMT16 English-German | download (.tar.bz2) | newstest2014 (shared vocab): download (.tar.bz2) |
| Transformer (Edunov et al., 2018; WMT'18 winner) | WMT'18 English-German | download (.tar.bz2) | See NOTE in the archive |

Language models

| Description | Dataset | Model | Test set(s) |
|---|---|---|---|
| Convolutional (Dauphin et al., 2017) | Google Billion Words | download (.tar.bz2) | download (.tar.bz2) |
| Convolutional (Dauphin et al., 2017) | WikiText-103 | download (.tar.bz2) | download (.tar.bz2) |

Stories

| Description | Dataset | Model | Test set(s) |
|---|---|---|---|
| Stories with Convolutional Model (Fan et al., 2018) | WritingPrompts | download (.tar.bz2) | download (.tar.bz2) |

Usage

Generation with the binarized test sets can be run in batch mode as follows, e.g. for WMT 2014 English-French on a GTX-1080ti:

$ mkdir -p data-bin
$ curl https://s3.amazonaws.com/fairseq-py/models/wmt14.v2.en-fr.fconv-py.tar.bz2 | tar xvjf - -C data-bin
$ curl https://s3.amazonaws.com/fairseq-py/data/wmt14.v2.en-fr.newstest2014.tar.bz2 | tar xvjf - -C data-bin
$ python generate.py data-bin/wmt14.en-fr.newstest2014  \
  --path data-bin/wmt14.en-fr.fconv-py/model.pt \
  --beam 5 --batch-size 128 --remove-bpe | tee /tmp/gen.out
...
| Translated 3003 sentences (96311 tokens) in 166.0s (580.04 tokens/s)
| Generate test with beam=5: BLEU4 = 40.83, 67.5/46.9/34.4/25.5 (BP=1.000, ratio=1.006, syslen=83262, reflen=82787)

# Scoring with score.py:
$ grep ^H /tmp/gen.out | cut -f3- > /tmp/gen.out.sys
$ grep ^T /tmp/gen.out | cut -f2- > /tmp/gen.out.ref
$ python score.py --sys /tmp/gen.out.sys --ref /tmp/gen.out.ref
BLEU4 = 40.83, 67.5/46.9/34.4/25.5 (BP=1.000, ratio=1.006, syslen=83262, reflen=82787)
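
The scoring steps above can also be sketched in Python. The first helper mirrors the `grep ^H | cut -f3-` and `grep ^T | cut -f2-` commands, assuming the output is tab-separated as those commands imply. The second recomputes the BLEU figure from the printed n-gram precisions, assuming the standard BLEU definition (brevity penalty times the geometric mean of the precisions). The function names and the sample lines are made up for illustration and are not part of fairseq.

```python
import math

def cut_from(line, first_field):
    """Mimic `cut -f<first_field>-` on one tab-separated line."""
    if "\t" not in line:
        return line  # cut prints lines without a delimiter unchanged
    return "\t".join(line.split("\t")[first_field - 1:])

def split_generate_output(lines):
    """Collect hypothesis (H) and reference (T) text from generate.py output."""
    hypotheses, references = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("H"):
            hypotheses.append(cut_from(line, 3))  # drop id and score fields
        elif line.startswith("T"):
            references.append(cut_from(line, 2))  # drop id field
    return hypotheses, references

def bleu_from_precisions(precisions, sys_len, ref_len):
    """Brevity penalty times the geometric mean of n-gram precisions (percent)."""
    brevity_penalty = 1.0 if sys_len > ref_len else math.exp(1.0 - ref_len / sys_len)
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty * math.exp(log_mean)

# Hypothetical sample lines; only the tab layout is taken from the commands above.
sample = ["S-0\tle chat\n", "T-0\tthe cat\n", "H-0\t-0.25\tthe cat\n"]
hyps, refs = split_generate_output(sample)
print(hyps, refs)

# Figures copied from the BLEU line printed above.
bleu = bleu_from_precisions([67.5, 46.9, 34.4, 25.5], sys_len=83262, ref_len=82787)
print(round(bleu, 2), round(83262 / 82787, 3))
```

Because the printed precisions are rounded to one decimal place, the recomputed score is expected to agree with the reported 40.83 only to within a few hundredths.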

Citation

If you use the code in your paper, then please cite it as:

@inproceedings{gehring2017convs2s,
  author    = {Gehring, Jonas and Auli, Michael and Grangier, David and Yarats, Denis and Dauphin, Yann N},
  title     = "{Convolutional Sequence to Sequence Learning}",
  booktitle = {Proc. of ICML},
  year      = 2017,
}

License

fairseq(-py) is BSD-licensed. The license applies to the pre-trained models as well. We also provide an additional patent grant.

Credits

This is a PyTorch version of fairseq, a sequence-to-sequence learning toolkit from Facebook AI Research. The original authors of this reimplementation are (in no particular order) Sergey Edunov, Myle Ott, and Sam Gross.