mirror of https://github.com/facebookresearch/fairseq.git synced 2024-09-19 05:09:20 +03:00

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.


Wei Ho 315fa5cbd9 Make error message for trying to train after make_generation_fast work correctly Summary: https://github.com/pytorch/fairseq/blob/master/fairseq/trainer.py#L164 calls `train()` without any argument Reviewed By: myleott Differential Revision: D13599203 fbshipit-source-id: 3a096a6dd35a7a3f8309fbda3b54a36f606475e3		2019-01-09 16:06:15 -08:00
docs	Update docs for --lazy-load and torch.distributed.launch	2019-01-07 15:28:09 -08:00
examples	Fix broken link in README.md (#436 )	2019-01-09 07:35:02 -08:00
fairseq	Make error message for trying to train after make_generation_fast work correctly	2019-01-09 16:06:15 -08:00
scripts	Fix arg formatting in preprocess.py and add fmt control for black formatting (#399 )	2018-12-06 13:24:45 -08:00
tests	Merge internal changes (#283 )	2019-01-04 20:03:19 -08:00
.gitignore	Ignore generated files for temporal convolution tbc	2017-10-19 08:12:39 -07:00
CONTRIBUTING.md	Architecture settings and readme updates	2017-09-15 11:40:28 -07:00
eval_lm.py	Merge internal changes (#283 )	2019-01-04 20:03:19 -08:00
fairseq.gif	Initial commit	2017-09-14 17:22:43 -07:00
generate.py	Merge internal changes (#283 )	2019-01-04 20:03:19 -08:00
interactive.py	Merge internal changes (#283 )	2019-01-04 20:03:19 -08:00
LICENSE	Initial commit	2017-09-14 17:22:43 -07:00
PATENTS	Initial commit	2017-09-14 17:22:43 -07:00
preprocess.py	Merge internal changes (#283 )	2019-01-04 20:03:19 -08:00
README.md	Merge internal changes (#283 )	2019-01-04 20:03:19 -08:00
requirements.txt	More updates for PyTorch (#114 )	2018-03-01 14:04:08 -05:00
score.py	Fix arg formatting in preprocess.py and add fmt control for black formatting (#399 )	2018-12-06 13:24:45 -08:00
setup.py	Switch to DistributedDataParallelC10d and bump version 0.5.0 -> 0.6.0	2018-09-25 17:36:43 -04:00
train.py	Misc fixes	2019-01-09 08:53:42 -08:00

README.md

Introduction

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. It provides reference implementations of various sequence-to-sequence models, including:

- Convolutional Neural Networks (CNN)
- Long Short-Term Memory (LSTM) networks
  - Luong et al. (2015): Effective Approaches to Attention-based Neural Machine Translation
  - Wiseman and Rush (2016): Sequence-to-Sequence Learning as Beam-Search Optimization
- Transformer (self-attention) networks

Fairseq features:

- multi-GPU (distributed) training on one machine or across multiple machines
- fast generation on both CPU and GPU with multiple search algorithms implemented:
  - beam search
  - Diverse Beam Search (Vijayakumar et al., 2016)
  - sampling (unconstrained and top-k)
- large mini-batch training even on a single GPU via delayed updates
- fast half-precision floating point (FP16) training
- extensible: easily register new models, criterions, tasks, optimizers and learning rate schedulers
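
The "delayed updates" feature can be read as gradient accumulation: gradients from several small mini-batches are summed before a single weight update, which gives the same step as one large mini-batch. The following is a minimal sketch of that idea on a toy one-parameter model. It is not fairseq code, and every name in it (`grad_sum`, `step_large_batch`, `step_delayed`, `update_freq`) is hypothetical.

```python
# Toy illustration of delayed updates (gradient accumulation).
# Assumption: a 1-D linear model y = w * x with squared-error loss.
# None of these helpers come from fairseq.

def grad_sum(w, batch):
    """Sum over the batch of d/dw (w * x - y) ** 2."""
    return sum(2.0 * (w * x - y) * x for x, y in batch)

def step_large_batch(w, data, lr):
    """One SGD step that processes the whole batch at once."""
    return w - lr * grad_sum(w, data) / len(data)

def step_delayed(w, data, lr, update_freq):
    """The same step, computed from `update_freq` smaller mini-batches.

    Gradients are accumulated first; the weight changes only once.
    Assumes len(data) is divisible by update_freq.
    """
    size = len(data) // update_freq
    accumulated = 0.0
    for i in range(update_freq):
        mini_batch = data[i * size:(i + 1) * size]
        accumulated += grad_sum(w, mini_batch)  # no weight update yet
    return w - lr * accumulated / len(data)

data = [(float(x), 3.0 * x) for x in range(1, 9)]  # 8 examples of y = 3x
w_large = step_large_batch(0.0, data, lr=0.01)
w_delayed = step_delayed(0.0, data, lr=0.01, update_freq=4)
print(w_large, w_delayed)
```

Both calls produce the same weight, which is why accumulating over small mini-batches lets a single GPU emulate a much larger batch at the cost of more forward and backward passes per update.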

We also provide pre-trained models for several benchmark translation and language modeling datasets.

Requirements and Installation

- A PyTorch installation
- For training new models, you'll also need an NVIDIA GPU and NCCL
- Python version 3.6

Currently fairseq requires PyTorch version >= 1.0.0. Please follow the instructions here: https://github.com/pytorch/pytorch#installation.

If you use Docker, make sure to increase the shared memory size by passing either --ipc=host or --shm-size as a command-line option to nvidia-docker run.

After PyTorch is installed, you can install fairseq with:

pip install -r requirements.txt
python setup.py build develop

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.

Pre-trained Models

We provide the following pre-trained models and pre-processed, binarized test sets:

Translation

Description	Dataset	Model	Test set(s)
Convolutional (Gehring et al., 2017)	WMT14 English-French	download (.tar.bz2)	newstest2014: download (.tar.bz2) newstest2012/2013: download (.tar.bz2)
Convolutional (Gehring et al., 2017)	WMT14 English-German	download (.tar.bz2)	newstest2014: download (.tar.bz2)
Convolutional (Gehring et al., 2017)	WMT17 English-German	download (.tar.bz2)	newstest2014: download (.tar.bz2)
Transformer (Ott et al., 2018)	WMT14 English-French	download (.tar.bz2)	newstest2014 (shared vocab): download (.tar.bz2)
Transformer (Ott et al., 2018)	WMT16 English-German	download (.tar.bz2)	newstest2014 (shared vocab): download (.tar.bz2)
Transformer (Edunov et al., 2018; WMT'18 winner)	WMT'18 English-German	download (.tar.bz2)	See NOTE in the archive

Language models

Description	Dataset	Model	Test set(s)
Convolutional (Dauphin et al., 2017)	Google Billion Words	download (.tar.bz2)	download (.tar.bz2)
Convolutional (Dauphin et al., 2017)	WikiText-103	download (.tar.bz2)	download (.tar.bz2)

Stories

Description	Dataset	Model	Test set(s)
Stories with Convolutional Model (Fan et al., 2018)	WritingPrompts	download (.tar.bz2)	download (.tar.bz2)

Usage

Generation with the binarized test sets can be run in batch mode as follows, e.g. for WMT 2014 English-French on a GTX-1080ti:

$ mkdir -p data-bin
$ curl https://s3.amazonaws.com/fairseq-py/models/wmt14.v2.en-fr.fconv-py.tar.bz2 | tar xvjf - -C data-bin
$ curl https://s3.amazonaws.com/fairseq-py/data/wmt14.v2.en-fr.newstest2014.tar.bz2 | tar xvjf - -C data-bin
$ python generate.py data-bin/wmt14.en-fr.newstest2014  \
  --path data-bin/wmt14.en-fr.fconv-py/model.pt \
  --beam 5 --batch-size 128 --remove-bpe | tee /tmp/gen.out
...
| Translated 3003 sentences (96311 tokens) in 166.0s (580.04 tokens/s)
| Generate test with beam=5: BLEU4 = 40.83, 67.5/46.9/34.4/25.5 (BP=1.000, ratio=1.006, syslen=83262, reflen=82787)

# Scoring with score.py:
$ grep ^H /tmp/gen.out | cut -f3- > /tmp/gen.out.sys
$ grep ^T /tmp/gen.out | cut -f2- > /tmp/gen.out.ref
$ python score.py --sys /tmp/gen.out.sys --ref /tmp/gen.out.ref
BLEU4 = 40.83, 67.5/46.9/34.4/25.5 (BP=1.000, ratio=1.006, syslen=83262, reflen=82787)
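
As a sanity check on how to read the score line above, the sketch below recombines the printed numbers using the standard BLEU definition (brevity penalty times the geometric mean of the n-gram precisions). It only reuses the figures from the output shown here. It does not call score.py, and the variable names are illustrative.

```python
import math

# Figures copied from the score line above.
precisions = [67.5, 46.9, 34.4, 25.5]  # 1- to 4-gram precisions, in percent
syslen, reflen = 83262, 82787          # system and reference lengths in tokens

ratio = syslen / reflen

# Standard BLEU brevity penalty: 1 when the system output is at least
# as long as the reference, exp(1 - reflen / syslen) otherwise.
bp = 1.0 if syslen >= reflen else math.exp(1.0 - reflen / syslen)

# Geometric mean of the four precisions, scaled by the brevity penalty.
bleu = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))

print(f"ratio={ratio:.3f} BP={bp:.3f} BLEU4={bleu:.2f}")
```

The recomputed value lands within about 0.01 of the reported 40.83. The small gap is expected because the precisions in the log line are already rounded to one decimal place.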

Join the fairseq community

Facebook page: https://www.facebook.com/groups/fairseq.users
Google group: https://groups.google.com/forum/#!forum/fairseq-users

Citation

If you use the code in your paper, then please cite it as:

@inproceedings{gehring2017convs2s,
  author    = {Gehring, Jonas and Auli, Michael and Grangier, David and Yarats, Denis and Dauphin, Yann N},
  title     = "{Convolutional Sequence to Sequence Learning}",
  booktitle = {Proc. of ICML},
  year      = 2017,
}

License

fairseq(-py) is BSD-licensed. The license applies to the pre-trained models as well. We also provide an additional patent grant.

Credits

This is a PyTorch version of fairseq, a sequence-to-sequence learning toolkit from Facebook AI Research. The original authors of this reimplementation are (in no particular order) Sergey Edunov, Myle Ott, and Sam Gross.