mirror of https://github.com/facebookresearch/fairseq.git synced 2024-09-11 17:25:31 +03:00

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Myle Ott 27568a7ebe Merge TracingCompliantTransformer and regular Transformer, fix NAT tests Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/899 Differential Revision: D18373060 Pulled By: myleott fbshipit-source-id: bb5510ec15799a0a10a7c0669e76d8200e1ba479		2019-11-13 09:12:13 -08:00
docs	Fix changes of file locations of subword-nmt (#1219)	2019-11-07 09:08:29 -08:00
examples	Camembert model and code (#904)	2019-11-10 11:29:07 -08:00
fairseq	Merge TracingCompliantTransformer and regular Transformer, fix NAT tests	2019-11-13 09:12:13 -08:00
fairseq_cli	Add fairseq to PyPI (#495)	2019-02-08 22:03:29 -08:00
scripts	Small fixes	2019-08-19 15:08:25 -07:00
tests	Merge TracingCompliantTransformer and regular Transformer, fix NAT tests	2019-11-13 09:12:13 -08:00
.gitignore	Add autogenerated cython files to gitignore (#860)	2019-09-18 15:58:38 -07:00
CODE_OF_CONDUCT.md	Adopt Contributor Covenant	2019-08-29 23:24:43 -07:00
CONTRIBUTING.md	Relicense fairseq under MIT license (#786)	2019-07-30 07:48:23 -07:00
eval_lm.py	Small fixes	2019-08-19 15:08:25 -07:00
fairseq_logo.png	Fixes (#442)	2019-01-14 08:58:51 -08:00
fairseq.gif	Initial commit	2017-09-14 17:22:43 -07:00
generate.py	Implementation of the paper "Jointly Learning to Align and Translate with Transformer Models" (#877)	2019-09-30 06:57:32 -07:00
hubconf.py	Minor cleanup for setup.py	2019-08-27 10:07:40 -07:00
interactive.py	Implementation of the paper "Jointly Learning to Align and Translate with Transformer Models" (#877)	2019-09-30 06:57:32 -07:00
LICENSE	Relicense fairseq under MIT license (#786)	2019-07-30 07:48:23 -07:00
preprocess.py	Implementation of the paper "Jointly Learning to Align and Translate with Transformer Models" (#877)	2019-09-30 06:57:32 -07:00
README.md	Camembert model and code (#904)	2019-11-10 11:29:07 -08:00
score.py	Relicense fairseq under MIT license (#786)	2019-07-30 07:48:23 -07:00
setup.py	Fix building of docs	2019-11-02 16:52:50 -07:00
train.py	Move fb_pathmgr registration out of train.py	2019-11-08 04:42:45 -08:00
validate.py	Small fixes	2019-08-19 15:08:25 -07:00

README.md

Introduction

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks.

What's New:

November 2019: CamemBERT model and code released
November 2019: BART model and code released
November 2019: XLM-R models and code released
September 2019: Nonautoregressive translation code released
August 2019: WMT'19 models released
July 2019: fairseq relicensed under MIT license
July 2019: RoBERTa models and code released
June 2019: wav2vec models and code released

Features:

Fairseq provides reference implementations of various sequence-to-sequence models, including:

Convolutional Neural Networks (CNN)
LightConv and DynamicConv models
- Pay Less Attention with Lightweight and Dynamic Convolutions (Wu et al., 2019)
Long Short-Term Memory (LSTM) networks
- Effective Approaches to Attention-based Neural Machine Translation (Luong et al., 2015)
Transformer (self-attention) networks
Non-autoregressive Transformers
- Non-Autoregressive Neural Machine Translation (Gu et al., 2017)
- Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement (Lee et al., 2018)
- Insertion Transformer: Flexible Sequence Generation via Insertion Operations (Stern et al., 2019)
- Mask-Predict: Parallel Decoding of Conditional Masked Language Models (Ghazvininejad et al., 2019)
- Levenshtein Transformer (Gu et al., 2019)

Additionally:

multi-GPU (distributed) training on one machine or across multiple machines
fast generation on both CPU and GPU with multiple search algorithms implemented:
- beam search
- Diverse Beam Search (Vijayakumar et al., 2016)
- sampling (unconstrained, top-k and top-p/nucleus)
large mini-batch training even on a single GPU via delayed updates
mixed precision training (trains faster with less GPU memory on NVIDIA tensor cores)
extensible: easily register new models, criterions, tasks, optimizers and learning rate schedulers
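
The extensibility point above relies on registering components by name. The following is a minimal, standalone sketch of a decorator-based registry, written as an illustration of the general pattern. It does not reproduce fairseq's own code: TOY_MODEL_REGISTRY, register_toy_model, ToyLSTM and build_toy_model are hypothetical names invented for this example. fairseq exposes its own decorators for this purpose (for example register_model in fairseq.models); see the full documentation for their exact usage.

```python
from typing import Callable, Dict, Type

# Hypothetical registry for illustration only; fairseq keeps its own registries.
TOY_MODEL_REGISTRY: Dict[str, Type] = {}


def register_toy_model(name: str) -> Callable[[Type], Type]:
    """Return a class decorator that stores the decorated class under `name`."""

    def decorator(cls: Type) -> Type:
        if name in TOY_MODEL_REGISTRY:
            raise ValueError("Cannot register duplicate model ({})".format(name))
        TOY_MODEL_REGISTRY[name] = cls
        return cls  # the class itself is returned unchanged

    return decorator


@register_toy_model("toy_lstm")
class ToyLSTM:
    """Placeholder standing in for a real model class."""

    def __init__(self, hidden_size: int = 8) -> None:
        self.hidden_size = hidden_size


def build_toy_model(name: str, **kwargs):
    """Look up a registered class by name and instantiate it."""
    return TOY_MODEL_REGISTRY[name](**kwargs)
```

With a registry like this, a command-line option or config value can select a component by its string name, and new components can be added without editing the lookup code.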

We also provide pre-trained models for several benchmark translation and language modeling datasets.
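
To make the top-k and top-p (nucleus) sampling options listed above more concrete, here is a small pure-Python sketch of the filtering step. It is an illustration of the general technique under simplifying assumptions (a single probability list, no batching or tensors), not fairseq's implementation; filter_probs and sample_token are hypothetical names.

```python
import random
from typing import List, Optional


def filter_probs(
    probs: List[float],
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
) -> List[float]:
    """Zero out tokens outside the top-k / nucleus set, then renormalize."""
    # Token indices sorted from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]
    if top_p is not None:
        kept, total = [], 0.0
        for i in order:
            kept.append(i)
            total += probs[i]
            if total >= top_p:  # smallest prefix whose mass reaches top_p
                break
        order = kept
    keep = set(order)
    mass = sum(probs[i] for i in order)
    return [probs[i] / mass if i in keep else 0.0 for i in range(len(probs))]


def sample_token(probs: List[float], rng: random.Random) -> int:
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Passing neither top_k nor top_p leaves the distribution unchanged, which corresponds to unconstrained sampling.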

Requirements and Installation

PyTorch version >= 1.2.0
Python version >= 3.5
For training new models, you'll also need an NVIDIA GPU and NCCL
For faster training, install NVIDIA's apex library with the --cuda_ext option
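
As a convenience, the version requirements above can be checked before installing. The script below is a hedged sketch, not part of fairseq: parse_version and check_requirements are hypothetical helpers, and the version parsing is deliberately simple (it keeps only the leading integers of each dotted component).

```python
import sys
from typing import List, Tuple

MIN_PYTHON = (3, 5)    # from the requirements list above
MIN_TORCH = (1, 2, 0)  # from the requirements list above


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn strings like '1.2.0+cu92' or '1.13.0a0' into a tuple of integers."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:  # pad so that '1.2' compares equal to '1.2.0'
        parts.append(0)
    return tuple(parts)


def check_requirements() -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python >= 3.5 is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if parse_version(str(torch.__version__)) < MIN_TORCH:
            problems.append("PyTorch >= 1.2.0 is required")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

The check covers only the Python and PyTorch versions; GPU, NCCL and apex availability are not tested here.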

To install fairseq:

pip install fairseq

On macOS:

CFLAGS="-stdlib=libc++" pip install fairseq

If you use Docker, make sure to increase the shared memory size by passing either --ipc=host or --shm-size as a command-line option to nvidia-docker run.

Installing from source

To install fairseq from source and develop locally:

git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable .

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.

Pre-trained models and examples

We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below, as well as example training and evaluation commands.

Translation: convolutional and transformer models are available
Language Modeling: convolutional and transformer models are available
wav2vec: wav2vec large model is available

More detailed READMEs for reproducing results from specific papers are available in the examples directory.

Join the fairseq community

Facebook page: https://www.facebook.com/groups/fairseq.users
Google group: https://groups.google.com/forum/#!forum/fairseq-users

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

Please cite as:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}