Commit Graph

5 Commits

Jerry Ma
3f4fc50163 Miscellaneous documentation improvements (#868)
Summary:
- More clearly document the correspondence between FairseqAdam and torch.optim.AdamW (see the sketch after this entry)
- Add ResamplingDataset to Sphinx docs
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/868

Differential Revision: D17523244

Pulled By: jma127

fbshipit-source-id: 8e7b34b24889b2c8f70b09a52a625d2af135734b
2019-09-23 12:27:12 -07:00
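
The first bullet above refers to the decoupled weight decay that fairseq's Adam shares with torch.optim.AdamW. A minimal sketch of the correspondence, assuming typical hyperparameters (the flag names in the comments are fairseq's CLI options; the values are illustrative, not taken from the commit):

```python
# Hedged sketch: fairseq's FairseqAdam applies decoupled weight decay,
# so torch.optim.AdamW with matching hyperparameters should yield
# equivalent parameter updates.
import torch

model = torch.nn.Linear(16, 16)  # toy model for illustration

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-3,             # --lr
    betas=(0.9, 0.98),   # --adam-betas '(0.9, 0.98)'
    eps=1e-8,            # --adam-eps
    weight_decay=0.01,   # --weight-decay
)
```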
Myle Ott
fbd4cef9a5 Add fairseq to PyPI (#495)
Summary:
- fairseq can now be installed via pip: `pip install fairseq`
- command-line tools are globally accessible: `fairseq-preprocess`, `fairseq-train`, `fairseq-generate`, etc. (a quick sanity check follows below)
Pull Request resolved: https://github.com/pytorch/fairseq/pull/495

Differential Revision: D14017761

Pulled By: myleott

fbshipit-source-id: 10c9f6634a3056074eac2f33324b4f1f404d4235
2019-02-08 22:03:29 -08:00
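
A quick sanity check of the new packaging, assuming `pip install fairseq` succeeded (the console scripts listed above are installed alongside the library):

```python
# After `pip install fairseq`, the package is importable as a library
# in addition to the globally accessible command-line tools.
import fairseq

print(fairseq.__version__)  # version string of the installed release
```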
Myle Ott
7633129ba8 Merge internal changes (#283)
Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/283

Pull Request resolved: https://github.com/pytorch/fairseq/pull/428

Differential Revision: D13564190

Pulled By: myleott

fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5
2019-01-04 20:03:19 -08:00
Sergey Edunov
1082ba352c Switch to DistributedDataParallelC10d and bump version 0.5.0 -> 0.6.0
- no more FP16Trainer; FP16 training now goes through an FP16Optimizer wrapper
- most of the distributed code moves into a new wrapper class, DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time
- Trainer now requires an extra dummy_batch argument at initialization; when workers have an uneven number of batches, we run forward/backward on the dummy batch and hide its gradients by multiplying the loss by 0 (see the sketch below)
- Trainer.train_step now takes a list of samples, which will allow cleaner --update-freq handling
2018-09-25 17:36:43 -04:00
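
A minimal sketch of the dummy-batch trick described in the third bullet, assuming hypothetical `model`, `criterion`, and sample names (this is not the fairseq Trainer API, just the idea): the worker still runs forward/backward so distributed gradient synchronization stays in step, but the zeroed loss contributes nothing.

```python
import torch

def train_step(model, criterion, sample, is_dummy_batch):
    loss = criterion(model(sample["input"]), sample["target"])
    if is_dummy_batch:
        # Multiplying by 0 keeps the autograd graph (and any gradient
        # all-reduce hooks) alive while zeroing every gradient this
        # batch would otherwise contribute.
        loss = loss * 0.0
    loss.backward()
    return loss
```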
Myle Ott
6381cc977f Add documentation
2018-09-03 19:15:23 -04:00