mirror of https://github.com/facebookresearch/fairseq.git synced 2024-09-11 17:25:31 +03:00

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

artificial-intelligence python pytorch starred-facebookresearch-repo starred-repo

Go to file

Myle Ott b41c74dc5b Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473 ) Summary: Changelog: - `e330f56`: Add code for the "Pay Less Attention with Lightweight and Dynamic Convolutions" paper - `5e3b98c`: Add scripts for computing tokenized BLEU with compound splitting and sacrebleu - update READMEs - misc fixes Pull Request resolved: https://github.com/pytorch/fairseq/pull/473 Differential Revision: D13819717 Pulled By: myleott fbshipit-source-id: f2dc12ea89a436b950cafec3593ed1b04af808e9		2019-01-25 15:40:26 -08:00
docs	Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473 )	2019-01-25 15:40:26 -08:00
examples	Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473 )	2019-01-25 15:40:26 -08:00
fairseq	Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473 )	2019-01-25 15:40:26 -08:00
scripts	Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473 )	2019-01-25 15:40:26 -08:00
tests	Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473 )	2019-01-25 15:40:26 -08:00
.gitignore	New command line option '--user-dir' (#440 )	2019-01-14 09:04:25 -08:00
CONTRIBUTING.md	Architecture settings and readme updates	2017-09-15 11:40:28 -07:00
eval_lm.py	FIX: '--user-dir' on multi-gpu (#449 )	2019-01-16 12:44:36 -08:00
fairseq_logo.png	Fixes (#442 )	2019-01-14 08:58:51 -08:00
fairseq.gif	Initial commit	2017-09-14 17:22:43 -07:00
generate.py	FIX: '--user-dir' on multi-gpu (#449 )	2019-01-16 12:44:36 -08:00
interactive.py	FIX: '--user-dir' on multi-gpu (#449 )	2019-01-16 12:44:36 -08:00
LICENSE	Initial commit	2017-09-14 17:22:43 -07:00
PATENTS	Initial commit	2017-09-14 17:22:43 -07:00
preprocess.py	Enforce UTF-8 when open() text files (#460 )	2019-01-24 09:39:04 -08:00
README.md	Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473 )	2019-01-25 15:40:26 -08:00
requirements.txt	More updates for PyTorch (#114 )	2018-03-01 14:04:08 -05:00
score.py	Fix arg formatting in preprocess.py and add fmt control for black formatting (#399 )	2018-12-06 13:24:45 -08:00
setup.py	Switch to DistributedDataParallelC10d and bump version 0.5.0 -> 0.6.0	2018-09-25 17:36:43 -04:00
train.py	Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473 )	2019-01-25 15:40:26 -08:00

README.md

Introduction

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. It provides reference implementations of various sequence-to-sequence models, including:

Convolutional Neural Networks (CNN)
LightConv and DynamicConv models
- New Wu et al. (2019): Pay Less Attention with Lightweight and Dynamic Convolutions
Long Short-Term Memory (LSTM) networks
- Luong et al. (2015): Effective Approaches to Attention-based Neural Machine Translation
- Wiseman and Rush (2016): Sequence-to-Sequence Learning as Beam-Search Optimization
Transformer (self-attention) networks

Fairseq features:

multi-GPU (distributed) training on one machine or across multiple machines
fast generation on both CPU and GPU with multiple search algorithms implemented:
- beam search
- Diverse Beam Search (Vijayakumar et al., 2016)
- sampling (unconstrained and top-k)
large mini-batch training even on a single GPU via delayed updates
fast half-precision floating point (FP16) training
extensible: easily register new models, criterions, tasks, optimizers and learning rate schedulers

We also provide pre-trained models for several benchmark translation and language modeling datasets.

Requirements and Installation

A PyTorch installation
For training new models, you'll also need an NVIDIA GPU and NCCL
Python version 3.6

Currently fairseq requires PyTorch version >= 1.0.0. Please follow the instructions here: https://github.com/pytorch/pytorch#installation.

If you use Docker make sure to increase the shared memory size either with --ipc=host or --shm-size as command line options to nvidia-docker run.

After PyTorch is installed, you can install fairseq with:

pip install -r requirements.txt
python setup.py build develop

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.

Pre-trained models and examples

We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below, as well as example training and evaluation commands.

Translation: convolutional and transformer models are available
Language Modeling: convolutional models are available

We also have more detailed READMEs to reproduce results from specific papers:

Join the fairseq community

Facebook page: https://www.facebook.com/groups/fairseq.users
Google group: https://groups.google.com/forum/#!forum/fairseq-users

License

fairseq(-py) is BSD-licensed. The license applies to the pre-trained models as well. We also provide an additional patent grant.

Credits

This is a PyTorch version of fairseq, a sequence-to-sequence learning toolkit from Facebook AI Research. The original authors of this reimplementation are (in no particular order) Sergey Edunov, Myle Ott, and Sam Gross.