Commit Graph

16 Commits

Myle Ott
881381cfc7 v0.7.1: fix PyPI setup and tests
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/818

Differential Revision: D15916265

Pulled By: myleott

fbshipit-source-id: c66c0bd988d3472c4150226952f34ee8d4c3db86
2019-06-20 06:28:37 -07:00
Myle Ott
bd710e75ae v0.7.0 (#817)
Summary:
Notable (possibly breaking) changes:
- d45db80: Move checkpoint utility functions from utils.py into checkpoint_utils.py
- f2563c2: Move LM definitions into separate files
- dffb167: Updates to model API:
  - `FairseqModel` -> `FairseqEncoderDecoderModel`
  - add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer`
  - `encoder_out_dict` -> `encoder_out`
  - rm unused `remove_head` functions
- 34726d5: Move `distributed_init` into `DistributedFairseqModel`
- cf17068: Simplify distributed launch by automatically launching multiprocessing on each node for all visible GPUs (allows launching just one job per node instead of one per GPU)
- d45db80: Change default LR scheduler from `reduce_lr_on_plateau` to `fixed`
- 96ac28d: Rename `--sampling-temperature` -> `--temperature`
- fc1a19a: Deprecate dummy batches
- a1c997b: Add memory mapped datasets
- 0add50c: Allow cycling over multiple datasets, where each one becomes an "epoch"

Plus many additional features and bugfixes
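Among the changes above, the memory-mapped datasets (a1c997b) follow a common pattern: token ids live in a flat binary file and are read back through `mmap`, so the OS pages data in lazily instead of loading the whole corpus into RAM. A minimal stdlib-only sketch of the idea (the real fairseq implementation also stores dtype and index metadata on disk; the names and layout here are illustrative, not fairseq's actual format):

```python
import mmap
import struct
import tempfile

def write_tokens(path, sentences):
    """Write lists of token ids as little-endian int32 and return sentence offsets."""
    offsets = [0]
    with open(path, "wb") as f:
        for sent in sentences:
            f.write(struct.pack(f"<{len(sent)}i", *sent))
            offsets.append(offsets[-1] + len(sent))
    return offsets  # element offsets into the flat token stream

class MMapDataset:
    """Read sentences back without loading the file into memory."""

    def __init__(self, path, offsets):
        self._f = open(path, "rb")
        self._mm = mmap.mmap(self._f.fileno(), 0, access=mmap.ACCESS_READ)
        self._offsets = offsets

    def __len__(self):
        return len(self._offsets) - 1

    def __getitem__(self, i):
        start, end = self._offsets[i] * 4, self._offsets[i + 1] * 4
        n = (end - start) // 4
        return list(struct.unpack(f"<{n}i", self._mm[start:end]))

tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()
offsets = write_tokens(tmp.name, [[5, 6, 7], [8, 9]])
ds = MMapDataset(tmp.name, offsets)
print(ds[1])  # -> [8, 9]
```

Because `mmap` defers actual reads to page faults, indexing into the dataset touches only the pages that hold the requested sentence.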
Pull Request resolved: https://github.com/pytorch/fairseq/pull/817

Differential Revision: D15913844

Pulled By: myleott

fbshipit-source-id: d5b5d678efdd9dd3e4d7ca848ddcf1ec2b21bf6b
2019-06-19 19:08:50 -07:00
Myle Ott
dffb167449 Updates to model API (#561)
Summary:
- `FairseqModel` -> `FairseqEncoderDecoderModel`
- add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer`
- `encoder_out_dict` -> `encoder_out`
- rm unused `remove_head` functions
- update docs
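Splitting the decoder's forward pass into `extract_features` and `output_layer` lets callers obtain hidden states without projecting them to vocabulary scores. A toy, framework-free sketch of the pattern (the method names mirror the fairseq API, but real decoders operate on tensors; the arithmetic here is purely illustrative):

```python
class ToyDecoder:
    """Illustrates the decoder split: forward = output_layer(extract_features(...))."""

    def __init__(self, embed_dim, vocab_size):
        # fixed "weights" so the example is deterministic
        self.proj = [[(i + j) % 3 for j in range(embed_dim)] for i in range(vocab_size)]

    def extract_features(self, prev_output_tokens, encoder_out):
        # hidden state per position: token id shifted by the pooled encoder output
        pooled = sum(encoder_out) / len(encoder_out)
        return [[tok + pooled] * len(self.proj[0]) for tok in prev_output_tokens]

    def output_layer(self, features):
        # project each feature vector to vocabulary-sized scores
        return [[sum(w * f for w, f in zip(row, feat)) for row in self.proj]
                for feat in features]

    def forward(self, prev_output_tokens, encoder_out):
        return self.output_layer(self.extract_features(prev_output_tokens, encoder_out))

dec = ToyDecoder(embed_dim=2, vocab_size=3)
scores = dec.forward([1, 2], encoder_out=[0.5, 1.5])
print(scores)  # -> [[2.0, 6.0, 4.0], [3.0, 9.0, 6.0]]
```

With this split, a caller that only needs representations (e.g. for probing or feature extraction) can stop at `extract_features` and skip the output projection entirely.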
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/561

Differential Revision: D15271142

Pulled By: myleott

fbshipit-source-id: 8e8864e399336020f0271c780598e968ff51a264
2019-05-15 07:12:41 -07:00
zhiqiang
d0577ba7a5 Fix option in docs (#735)
Summary:
`--output-format` -> `--dataset-impl` in Tutorial: Classifying Names with a Character-Level RNN
Pull Request resolved: https://github.com/pytorch/fairseq/pull/735

Differential Revision: D15314625

Pulled By: myleott

fbshipit-source-id: 65b8efd1a367ca754e5b9dca088aefbc648864dd
2019-05-12 16:37:59 -07:00
Myle Ott
d45db80431 Merge internal changes (#654)
Summary:
- Add --add-bos-token option to LM task
- Cleanup utils.py and options.py
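The `--add-bos-token` option prepends a beginning-of-sentence symbol to each sample so the language model can condition its first real token on it. A minimal sketch of the idea (the index value and helper name are illustrative, not fairseq's actual dictionary layout):

```python
def maybe_add_bos(token_ids, bos_index, add_bos_token=False):
    """Prepend the BOS index when the option is enabled; hypothetical helper."""
    if add_bos_token:
        return [bos_index] + list(token_ids)
    return list(token_ids)

print(maybe_add_bos([10, 11], bos_index=0, add_bos_token=True))   # -> [0, 10, 11]
print(maybe_add_bos([10, 11], bos_index=0, add_bos_token=False))  # -> [10, 11]
```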
Pull Request resolved: https://github.com/pytorch/fairseq/pull/654

Differential Revision: D15041794

Pulled By: myleott

fbshipit-source-id: 3ad00007769d5f48308052cfd40de39c5ffa1a6e
2019-04-29 19:50:58 -07:00
Myle Ott
e6422528da 0.6.1 -> 0.6.2 (#577)
Summary:
Changelog:
- 998ba4f: Add language models from Baevski & Auli (2018)
- 4294c4f: Add mixture of experts code from Shen et al. (2019)
- 0049349: Add example for multilingual training
- 48d9afb: Speed improvements, including fused operators from apex
- 44d27e6: Add Tensorboard support
- d17fa85: Add Adadelta optimizer
- 9e1c880: Add `FairseqEncoderModel`
- b65c579: Add `FairseqTask.inference_step` to modularize generate.py
- 2ad1178: Add back `--curriculum`
- Misc bug fixes and other features
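Of the changes above, `FairseqTask.inference_step` (b65c579) moves generation behind the task interface so generate.py no longer needs model- or task-specific logic. A framework-free sketch of that delegation pattern (the class and method names mirror the API, but the greedy "generator" and toy model are stand-ins, not fairseq code):

```python
class GreedyGenerator:
    """Stand-in for a sequence generator: picks the argmax token per step."""

    def generate(self, models, sample):
        model = models[0]
        scores = model(sample["input"])
        return [max(range(len(step)), key=step.__getitem__) for step in scores]

class TranslationTask:
    """Sketch of the pattern: the task owns how inference is performed."""

    def build_generator(self):
        return GreedyGenerator()

    def inference_step(self, generator, models, sample):
        return generator.generate(models, sample)

# a toy "model": returns per-step vocabulary scores
toy_model = lambda inp: [[0.1, 0.7, 0.2] for _ in inp]
task = TranslationTask()
hypo = task.inference_step(task.build_generator(), [toy_model], {"input": [1, 2]})
print(hypo)  # -> [1, 1]
```

A generation script can now loop over `task.inference_step(...)` without knowing anything about the underlying model or decoding strategy.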

Pull Request resolved: https://github.com/pytorch/fairseq/pull/577

Differential Revision: D14481233

Pulled By: myleott

fbshipit-source-id: 4ff8625ef1c0b24273fc65df7c5658e3c932e8b7
2019-03-15 10:27:01 -07:00
Vladimir Karpukhin
f296824f40 Move string line encoding logic from tokenizer to Dictionary (unified diff). (#541)
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/541

Just a combination of the stacked pair D14057943 & D14176011.
Made this as a separate diff because there seems to be some issue with porting a stacked change into the GitHub repo

Differential Revision: D14251048

fbshipit-source-id: 0a47f534a69d6ab2ebe035fba40fd51748cccfb8
2019-02-28 09:19:12 -08:00
Myle Ott
fbd4cef9a5 Add fairseq to PyPI (#495)
Summary:
- fairseq can now be installed via pip: `pip install fairseq`
- command-line tools are globally accessible: `fairseq-preprocess`, `fairseq-train`, `fairseq-generate`, etc.
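Globally accessible commands like these are typically wired up through setuptools console-script entry points, which map a command name to a Python callable. A hedged sketch of what such a `setup.py` fragment looks like (the module paths here are illustrative assumptions, not necessarily fairseq's actual ones):

```python
# setup.py fragment: each console script maps a command name to a callable
# (module paths are hypothetical examples)
entry_points = {
    "console_scripts": [
        "fairseq-preprocess = fairseq_cli.preprocess:cli_main",
        "fairseq-train = fairseq_cli.train:cli_main",
        "fairseq-generate = fairseq_cli.generate:cli_main",
    ],
}
```

When the package is installed, pip generates a small launcher script for each entry, which is why the commands work from any directory after `pip install fairseq`.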
Pull Request resolved: https://github.com/pytorch/fairseq/pull/495

Differential Revision: D14017761

Pulled By: myleott

fbshipit-source-id: 10c9f6634a3056074eac2f33324b4f1f404d4235
2019-02-08 22:03:29 -08:00
Myle Ott
b41c74dc5b Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473)
Summary:
Changelog:
- `e330f56`: Add code for the "Pay Less Attention with Lightweight and Dynamic Convolutions" paper
- `5e3b98c`: Add scripts for computing tokenized BLEU with compound splitting and sacrebleu
- update READMEs
- misc fixes
Pull Request resolved: https://github.com/pytorch/fairseq/pull/473

Differential Revision: D13819717

Pulled By: myleott

fbshipit-source-id: f2dc12ea89a436b950cafec3593ed1b04af808e9
2019-01-25 15:40:26 -08:00
Davide Caroselli
ebaf8c5030 '--user-dir' documentation (correct) (#447)
Summary:
The command-line option `--user-dir` is now documented in docs/overview.rst
Pull Request resolved: https://github.com/pytorch/fairseq/pull/447

Differential Revision: D13674744

Pulled By: myleott

fbshipit-source-id: 17049ee5c9f692f5298ef9fa7381ee583f269cde
2019-01-15 11:54:17 -08:00
Myle Ott
14bd9c62a3 Update docs for --lazy-load and torch.distributed.launch
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/433

Differential Revision: D13588032

Pulled By: myleott

fbshipit-source-id: 0e5ff361e27b206c4490264f0f51863367499e81
2019-01-07 15:28:09 -08:00
Myle Ott
7633129ba8 Merge internal changes (#283)
Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/283

Pull Request resolved: https://github.com/pytorch/fairseq/pull/428

Differential Revision: D13564190

Pulled By: myleott

fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5
2019-01-04 20:03:19 -08:00
Sergey Edunov
1082ba352c Switch to DistributedDataParallelC10d and bump version 0.5.0 -> 0.6.0
- no more FP16Trainer; we just have an FP16Optimizer wrapper
- most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time
- Trainer now requires an extra dummy_batch argument at initialization, which we do a forward/backward pass on when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0
- Trainer.train_step now takes a list of samples, which will allow a cleaner --update-freq implementation
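The dummy-batch trick above keeps workers in sync when batch counts are uneven: every worker runs the same number of forward/backward passes (so collective communication never deadlocks), but a dummy pass multiplies its loss by 0 so it contributes nothing to the gradients. A framework-free sketch of the bookkeeping (the real Trainer does this with torch tensors; function names here are illustrative):

```python
def pad_with_dummies(batches_per_worker, dummy_batch):
    """Give every worker the same number of steps; extras are flagged as dummies."""
    max_steps = max(len(b) for b in batches_per_worker)
    padded = []
    for batches in batches_per_worker:
        steps = [(b, False) for b in batches]
        steps += [(dummy_batch, True)] * (max_steps - len(batches))
        padded.append(steps)
    return padded

def train_step(batch, is_dummy):
    loss = sum(batch)      # stand-in for the forward pass
    if is_dummy:
        loss = loss * 0    # run the step, but hide its gradient contribution
    return loss

# worker 0 has two batches, worker 1 has one and gets a zero-loss dummy step
workers = pad_with_dummies([[[1, 2], [3]], [[4]]], dummy_batch=[9])
totals = [sum(train_step(b, d) for b, d in steps) for steps in workers]
print(totals)  # -> [6, 4]: the dummy batch on worker 1 adds 0
```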
2018-09-25 17:36:43 -04:00
Sergey Edunov
fe2d1581a4 Fix docs 2018-09-17 22:34:17 -07:00
Myle Ott
4a47b88992 Update documentation 2018-09-03 20:03:37 -04:00
Myle Ott
6381cc977f Add documentation 2018-09-03 19:15:23 -04:00