Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Introduction

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. It provides reference implementations of a variety of sequence-to-sequence models, including convolutional, LSTM and Transformer architectures.

Fairseq features:

  • multi-GPU (distributed) training on one machine or across multiple machines
  • fast generation on both CPU and GPU, with multiple search algorithms implemented (e.g. beam search and sampling)
  • large mini-batch training even on a single GPU via delayed updates (see the example command after this list)
  • fast half-precision floating point (FP16) training
  • extensible: easily register new models, criterions, tasks, optimizers and learning rate schedulers
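
As a rough illustration of delayed updates and FP16 training, a minimal sketch of a training command is shown below. The dataset path, architecture and hyperparameters are placeholders rather than a tuned recipe; complete commands are given in the documentation and the examples. By default, training uses all GPUs visible to the process.

python train.py data-bin/wmt16_en_de \
  --arch transformer --optimizer adam --lr 0.0005 \
  --max-tokens 4096 --update-freq 16 --fp16

Here --update-freq 16 accumulates gradients over 16 mini-batches before each parameter update (simulating a 16x larger batch on a single GPU), and --fp16 enables half-precision training.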

We also provide pre-trained models for several benchmark translation and language modeling datasets.

Requirements and Installation

Currently fairseq requires PyTorch version >= 1.0.0. Please follow the instructions here: https://github.com/pytorch/pytorch#installation.

If you use Docker, make sure to increase the shared memory size, either with --ipc=host or with --shm-size as a command-line option to nvidia-docker run.
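
For example (the image name below is just a placeholder for your own PyTorch image):

nvidia-docker run --ipc=host -it <your-pytorch-image> bash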

After PyTorch is installed, you can install fairseq with pip:

pip install fairseq

Installing from source

To install fairseq from source and develop locally:

git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable .

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.
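
As a rough sketch of the typical first step (the paths and language pair below are placeholders, and the input files are assumed to be already tokenized), preprocess.py binarizes parallel text for training:

python preprocess.py --source-lang de --target-lang en \
  --trainpref data/train --validpref data/valid --testpref data/test \
  --destdir data-bin/iwslt14_de_en

train.py and generate.py then take the resulting data-bin directory as their data argument.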

Pre-trained models and examples

We provide pre-trained models and pre-processed, binarized test sets for several translation and language modeling tasks, as well as example training and evaluation commands.
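
For instance, translating a binarized test set with a downloaded checkpoint might look like the following sketch (the data and checkpoint paths are placeholders; the exact command for each model is given in the corresponding README):

python generate.py data-bin/wmt14.en-fr \
  --path checkpoints/model.pt --beam 5 --batch-size 128 --remove-bpe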

We also have more detailed READMEs that reproduce results from specific papers.

Join the fairseq community

License

fairseq(-py) is BSD-licensed. The license applies to the pre-trained models as well. We also provide an additional patent grant.

Credits

This is a PyTorch version of fairseq, a sequence-to-sequence learning toolkit from Facebook AI Research. The original authors of this reimplementation are (in no particular order) Sergey Edunov, Myle Ott, and Sam Gross.