Commit Graph

47 Commits

Author SHA1 Message Date
Guillaume Wenzek
699ab19014
run all tests (#4733)
* run all tests

* make torch a build-time dependency

* add 'dev' extra deps to install black, flake, pytest at once

* Build docs in CI

This should also help catch some import bugs, since Sphinx inspects a lot of code

* CI should do a real install, not an "--editable" one

* check installation succeeded

* add missing __init__.py file

* add check installation

* move check_installation.py to its own script

* fix pytest import mode, force recent numpy, torch

* run black before flake and tests

* torch >= 1.10.0

* use torch 1.10 for GPU tests
2022-09-23 18:40:50 +02:00
Sebastian Vincent
71a21dfb65
closes #4549 (#4550) 2022-07-07 07:38:48 -04:00
Diana Liskovich
5adfeaccf9 Rename references from master -> main in preparation for branch name change (#2297)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2297

Reviewed By: alexeib

Differential Revision: D30906090

Pulled By: dianaml0

fbshipit-source-id: 941d30db7f766c9077a1b5bb2a04680f57e2e070
2021-09-20 08:29:38 -07:00
Pierre Andrews
4cf7d76114 Hydra Integration doc should refer to non legacy task (#1619)
Summary:
# Before submitting

- [NO] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [YES] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [YES] Did you make sure to update the docs?
- [NO] Did you write any new necessary tests?

## What does this PR do?

This is a typo fix to the Hydra Integration doc where the example with dataclass config should use `FairseqTask` and not `LegacyFairseqTask`.

Didn't open an issue for this, as it's a trivial doc change to make the example match the actual doc.

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1619

Reviewed By: huihuifan

Differential Revision: D26448855

Pulled By: Mortimerp9

fbshipit-source-id: 467323101b8425370f6bd7c0532e70abb319b337
2021-02-20 06:27:14 -08:00
Myle Ott
72a25a4e52 Rename optimization.min_lr -> optimization.stop_min_lr (#1486)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1486

Test Plan: Imported from OSS

Reviewed By: alexeib

Differential Revision: D25342181

Pulled By: myleott

fbshipit-source-id: 7d1cfb26334fff26d688648724ab073e5fb956f5
2020-12-05 07:37:51 -08:00
Myle Ott
4df4d0af8d Add missing --optimizer option to tutorial docs (fixes #2830) (#1485)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1485

Test Plan: Imported from OSS

Reviewed By: alexeib

Differential Revision: D25342182

Pulled By: myleott

fbshipit-source-id: 7eb2a4b2b7377da31d4f538053cc196437532db0
2020-12-05 07:37:50 -08:00
alexeib
f13f299093 fix issubclass() call on python 3.7+ (#1462)
Summary:
Fixes #2897

Also updates READMEs to use --config-dir instead of --config-path for hydra runs, and adds __init__.py to the config dir
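
For context, the usual shape of this Python 3.7+ pitfall (a guessed illustration, not necessarily the exact fairseq change): `issubclass()` raises `TypeError` when its first argument is not a class, which started biting code that passes `typing` generics once those stopped being classes in 3.7, so the argument needs an `isinstance(..., type)` guard.

```python
from typing import List

def safe_issubclass(candidate, base):
    """issubclass() that tolerates non-class arguments such as typing generics."""
    return isinstance(candidate, type) and issubclass(candidate, base)

print(safe_issubclass(bool, int))        # True
print(safe_issubclass(List[int], list))  # False on 3.7+, instead of TypeError
```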

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1462

Reviewed By: myleott

Differential Revision: D25163789

Pulled By: alexeib

fbshipit-source-id: f45f432174771c5c458480f984aedf12130b8522
2020-11-23 19:08:39 -08:00
Myle Ott
3b77a61600 Add fairseq-hydra-train and update docs (#1449)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1449

Test Plan: Imported from OSS

Reviewed By: alexeib

Differential Revision: D25094525

Pulled By: myleott

fbshipit-source-id: 430387d11196d3292933bb168cf09ea16ebc0d3b
2020-11-20 06:00:59 -08:00
alexeib
bd2e804b9c add and link hydra docs (#1405)
Summary:
Updates the Hydra integration doc and links to it

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1405

Reviewed By: myleott

Differential Revision: D24808779

Pulled By: alexeib

fbshipit-source-id: a50160e196e469e30e39d6ee47440a569c0154bd
2020-11-07 17:25:18 -08:00
Myle Ott
1bc83c703a Misc fixes (#2786)
Summary:
- Rename type -> key in fairseq/tasks/sentence_prediction.py (fixes https://github.com/pytorch/fairseq/issues/2746)
- Update preprocessing docs (fixes https://github.com/pytorch/fairseq/issues/2565)
- Turn off logging in test_fp16_optimizer.TestGradientScaling
- Documentation updates
- Remove some unused code
- Fix noisychannel example (fixes https://github.com/pytorch/fairseq/issues/2213)

Pull Request resolved: https://github.com/pytorch/fairseq/pull/2786

Reviewed By: shruti-bh

Differential Revision: D24515146

Pulled By: myleott

fbshipit-source-id: 86b0f5516c57610fdca801c60e58158ef052fc3a
2020-10-27 11:26:07 -07:00
Myle Ott
e0737c3c29 Dynamically generate versions based on commit hash (#2774)
Summary:
This will produce version strings like `1.0.0a0+3065963`, similar to PyTorch version strings.
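
A minimal sketch of how such a string can be derived (hypothetical helper, not the actual `setup.py` code): append the short hash of HEAD to the base version as a PEP 440 local version label.

```python
import subprocess

def dynamic_version(base="1.0.0a0"):
    """Return e.g. '1.0.0a0+3065963' inside a git checkout, else just `base`."""
    try:
        sha = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"],
            stderr=subprocess.DEVNULL,
        ).decode("ascii").strip()
        return f"{base}+{sha}"
    except (OSError, subprocess.CalledProcessError):
        return base  # not a git checkout, e.g. installing from an sdist

print(dynamic_version())
```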

Pull Request resolved: https://github.com/pytorch/fairseq/pull/2774

Reviewed By: alexeib

Differential Revision: D24453517

Pulled By: myleott

fbshipit-source-id: 03a0c324ed6124bbc513ba7edc954abd71d63a0f
2020-10-22 12:51:04 -07:00
alexeib
3b27ed7996 Enable Hydra configs in fairseq (#1343) (#1510)
Summary:
Pull Request resolved: https://github.com/facebookresearch/pytext/pull/1510

This is the main PR that switches on Hydra functionality in fairseq.

We migrate the "args" object into an omegaconf "DictConfig" at all legacy entry points.

In addition, this migrates various components from secondary registries (like BPE encoders and tokenizers) to make the migration smoother.

I am going through code that references migrated fairseq components and changing it to inherit from "Legacy*" components instead; hopefully tests will catch most of this.

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1343

Reviewed By: myleott

Differential Revision: D23973928

Pulled By: alexeib

fbshipit-source-id: dd9554981fff51ea75c1ff343874d1d6e61793c9
2020-10-20 00:32:26 -07:00
Myle Ott
a48f235636 Apply black+isort (#1357)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1357

Reviewed By: alexeib

Differential Revision: D24377772

fbshipit-source-id: 51581af041d42d62166b33a35a1a4228b1a76f0c
2020-10-18 18:14:51 -07:00
Mu Tian
42c5dcbd18 hydra fairseq 3 - inherit from legacy for fairseq classes
Summary: hydra fairseq 3 - inherit from legacy for fairseq classes

Reviewed By: alexeib

Differential Revision: D23375457

fbshipit-source-id: ef9d19f2d02f2326eea44a70f1f6e1668b420840
2020-09-09 17:02:13 -07:00
Myle Ott
1cc8e95cec Don't cache epoch iterators when using sharded datasets (#1268)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1268

We previously had a memory leak when using sharded datasets. In particular,
each sharded dataset is a new FairseqDataset instance, and the cache is keyed
by the `dataset` instance. Since we never clear the cache, this would
eventually cause the system to run out of CPU RAM.

This diff disables caching when using sharded datasets.

Note that we also change the signature to `get_batch_iterator`, which needs to
propagate to many places. We previously avoided this update when adding
`data_buffer_size`, so I'm also adding that everywhere.
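
A minimal sketch of the failure mode described above (illustrative names, not fairseq's actual code): when the iterator cache is keyed by the dataset object and every shard is a fresh instance, entries accumulate forever, so caching has to be skipped for sharded datasets.

```python
def make_iterator(dataset):
    # Stand-in for the real batch-iterator construction.
    return iter(dataset)

class Task:
    def __init__(self, disable_iterator_cache):
        # Sharded datasets create a fresh dataset object per shard; a cache
        # keyed by that object grows without bound, so disable the cache.
        self.disable_iterator_cache = disable_iterator_cache
        self._itr_cache = {}  # keyed by dataset instance identity

    def get_batch_iterator(self, dataset):
        if self.disable_iterator_cache:
            return make_iterator(dataset)
        key = id(dataset)
        if key not in self._itr_cache:
            self._itr_cache[key] = make_iterator(dataset)
        return self._itr_cache[key]
```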

Reviewed By: ngoyal2707

Differential Revision: D23319135

fbshipit-source-id: 6bcd6aee141ad9cc234448c49106a8dbf8ea1800
2020-09-09 06:20:31 -07:00
Myle Ott
adbd89fd4b Misc fixes (#2492)
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/2492

Reviewed By: ngoyal2707

Differential Revision: D23177728

Pulled By: myleott

fbshipit-source-id: 32424f61cab57f759f87e16e8d5144d3eed5ae36
2020-08-20 06:42:10 -07:00
Myle Ott
9831634946 Misc fixes (#2448)
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/2448

Reviewed By: ngoyal2707

Differential Revision: D23011193

Pulled By: myleott

fbshipit-source-id: 1a29481707108e4465aca78ec1581fb79f05efba
2020-08-14 10:24:51 -07:00
Myle Ott
ffecb4e349 Small fixes (#1215)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1215

Reviewed By: ngoyal2707, msbaines

Differential Revision: D22514719

Pulled By: myleott

fbshipit-source-id: 5f15ba501fd66af1eb49b5702aff940f06c3d91f
2020-07-14 14:17:13 -07:00
Myle Ott
f0a61a2774 Miscellaneous fixes (#1196)
Summary:
Incorporate several fixes, incl. from OSS contributors:
- fix model argument in sequence generator in semisupervised_translation.py
- fix aggregate logging in semisupervised_translation.py
- Fix EOS token in multilingual_denoising
- Handle missing eos_idx in data_utils.collate_tokens
- Better OOM handling for single-GPU training
- fix prepend_bos argument in translation_from_pretrained_bart.py …
- Fix eos_idx in multilingual_denoising
- Small logging fixes
- Fix fb_hub on PyTorch 1.6
- Better variable names
- Add support for model parallel to interactive.py
- Use `//` operator to fix Integer division warning
- Set default `--clip-norm=0.0`
- Cleanup some binaries in root directory

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1196

Reviewed By: ngoyal2707

Differential Revision: D22162202

Pulled By: myleott

fbshipit-source-id: 835b0c0ad9246827f9d915fdb4e89d7b5be2475d
2020-06-24 10:08:53 -07:00
Myle Ott
da94e58c70 TPU support for Translation (#2245)
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/2245

Reviewed By: ngoyal2707

Differential Revision: D22070745

Pulled By: myleott

fbshipit-source-id: e43a96a585366b10d997a12522e8cd6496294ad2
2020-06-24 09:56:42 -07:00
alexeib
3335de5f44 add vq-wav2vec (#1029)
Summary:
Sanitized vq-wav2vec implementation. I will also add docs to this. I have a fixed-up checkpoint that this code can load, and I verified that it produces the same results as what we used in the paper.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1029

Differential Revision: D20129246

Pulled By: alexeib

fbshipit-source-id: f72f455e0c309168e644ab86ec18c768c308da98
2020-02-29 18:25:34 -08:00
Myle Ott
be3515b289 More fully deprecate --raw-text and --lazy-load (fixes #1488)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/947

Differential Revision: D19084273

Pulled By: myleott

fbshipit-source-id: de80d9abfac8e3d813a9c9b343b41327c500344e
2019-12-16 17:22:11 -08:00
Myle Ott
df2f84ce61 v0.8.0 -> v0.9.0 (#1452)
Summary:
Possibly breaking changes:
- Set global numpy seed (4a7cd58)
- Split `in_proj_weight` into separate k, v, q projections in MultiheadAttention (fdf4c3e)
- TransformerEncoder returns namedtuples instead of dict (27568a7)

New features:
- Add `--fast-stat-sync` option (e1ba32a)
- Add `--empty-cache-freq` option (315c463)
- Support criterions with parameters (ba5f829)

New papers:
- Simple and Effective Noisy Channel Modeling for Neural Machine Translation (49177c9)
- Levenshtein Transformer (86857a5, ...)
- Cross+Self-Attention for Transformer Models (4ac2c5f)
- Jointly Learning to Align and Translate with Transformer Models (1c66792)
- Reducing Transformer Depth on Demand with Structured Dropout (dabbef4)
- Unsupervised Cross-lingual Representation Learning at Scale (XLM-RoBERTa) (e23e5ea)
- BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (a92bcda)
- CamemBERT: a French BERT (b31849a)

Speed improvements:
- Add CUDA kernels for LightConv and DynamicConv (f840564)
- Cythonization of various dataloading components (4fc3953, ...)
- Don't project mask tokens for MLM training (718677e)
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1452

Differential Revision: D18798409

Pulled By: myleott

fbshipit-source-id: 860a0d5aaf7377c8c9bd63cdb3b33d464f0e1727
2019-12-03 15:19:33 -08:00
Kevin
13d9e2baf8 Fix changes of file locations of subword-nmt (#1219)
Summary:
Solves https://github.com/pytorch/fairseq/issues/1218.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1219

Differential Revision: D18339541

Pulled By: myleott

fbshipit-source-id: 6d5bd7b60fa7fd30c038fdad54591343a01f228b
2019-11-07 09:08:29 -08:00
Myle Ott
a0f75996b1 Fix building of docs
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1340

Differential Revision: D18289455

Pulled By: myleott

fbshipit-source-id: a1c8163a35273b6c646d300142701e8a317d7378
2019-11-02 16:52:50 -07:00
Zhanghao Wu
2314979ea5 Update getting_started.rst (#1188)
Summary:
Hi,

I think there is a minor mistake in the doc. The `--distributed-no-spawn` argument is needed for distributed training on multiple machines without `slurm`. Otherwise, the program will start 8 jobs on each GPU when `nproc_per_node=8`.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1188

Differential Revision: D17627778

Pulled By: myleott

fbshipit-source-id: 35ab6b650dc1132d7cb2d150e80d2ebf0caf3e69
2019-09-27 07:27:28 -07:00
Jerry Ma
3f4fc50163 Miscellaneous documentation improvements: (#868)
Summary:
- More clearly document the correspondence between FairseqAdam and torch.optim.AdamW
- Add ResamplingDataset to Sphinx docs
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/868

Differential Revision: D17523244

Pulled By: jma127

fbshipit-source-id: 8e7b34b24889b2c8f70b09a52a625d2af135734b
2019-09-23 12:27:12 -07:00
Myle Ott
ffffe04ea1 v0.7.2 -> v0.8.0 (#1017)
Summary:
Changelog:
- Relicensed under MIT license
- Add RoBERTa
- Add wav2vec
- Add WMT'19 models
- Add initial ASR code
- Changed torch.hub interface (`generate` renamed to `translate`)
- Add `--tokenizer` and `--bpe`
- `f812e52`: Renamed data.transforms -> data.encoders
- `654affc`: New Dataset API (optional)
- `47fd985`: Deprecate old Masked LM components
- `5f78106`: Set mmap as default dataset format and infer format automatically
- Misc fixes for sampling
- Misc fixes to support PyTorch 1.2
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1017

Differential Revision: D16799880

Pulled By: myleott

fbshipit-source-id: 45ad8bc531724a53063cbc24ca1c93f715cdc5a7
2019-08-14 05:02:45 -07:00
Myle Ott
8835d93cf0 Standardize on 'teacher forcing' rather than 'input feeding' which is… (#769)
Summary:
Input feeding generally refers to a slightly different concept
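
For context on the term (a toy sketch assuming a standard PyTorch sequence-to-sequence setup): teacher forcing conditions the decoder on the gold target shifted right by one position, rather than on the decoder's own previous predictions.

```python
import torch

def shift_right(target, bos_idx):
    """Build decoder inputs for teacher forcing: gold tokens shifted right."""
    prev_output_tokens = target.roll(shifts=1, dims=1)
    prev_output_tokens[:, 0] = bos_idx
    return prev_output_tokens

target = torch.tensor([[5, 6, 7, 2]])  # toy target ending in EOS = 2
print(shift_right(target, bos_idx=2))  # tensor([[2, 5, 6, 7]])
```
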
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/769

Differential Revision: D16491898

Pulled By: myleott

fbshipit-source-id: 68573584e820f11f199db4e7e37e9ee7a69a3287
2019-07-25 07:24:07 -07:00
Myle Ott
8af5554269 Improve interactive generation (support --tokenizer and --bpe)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/734

Differential Revision: D16377044

Pulled By: myleott

fbshipit-source-id: 37d5553d76aa7c653113fec089f59710281c31d7
2019-07-19 06:45:18 -07:00
Myle Ott
b002d0096e v0.7.1 -> v0.7.2 (#891)
Summary:
No major API changes since the last release. Cutting a new release since we'll be merging significant (possibly breaking) changes to logging, data loading and the masked LM implementation soon.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/891

Differential Revision: D16377132

Pulled By: myleott

fbshipit-source-id: f1cb88e671ccd510e53334d0f449fe18585268c7
2019-07-19 06:33:40 -07:00
Myle Ott
881381cfc7 v0.7.1: fix PyPI setup and tests
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/818

Differential Revision: D15916265

Pulled By: myleott

fbshipit-source-id: c66c0bd988d3472c4150226952f34ee8d4c3db86
2019-06-20 06:28:37 -07:00
Myle Ott
bd710e75ae v0.7.0 (#817)
Summary:
Notable (possibly breaking) changes:
- d45db80: Remove checkpoint utility functions from utils.py into checkpoint_utils.py
- f2563c2: Move LM definitions into separate files
- dffb167: Updates to model API:
  - `FairseqModel` -> `FairseqEncoderDecoderModel`
  - add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer`
  - `encoder_out_dict` -> `encoder_out`
  - rm unused `remove_head` functions
- 34726d5: Move `distributed_init` into `DistributedFairseqModel`
- cf17068: Simplify distributed launch by automatically launching multiprocessing on each node for all visible GPUs (allows launching just one job per node instead of one per GPU)
- d45db80: Change default LR scheduler from `reduce_lr_on_plateau` to `fixed`
- 96ac28d: Rename `--sampling-temperature` -> `--temperature`
- fc1a19a: Deprecate dummy batches
- a1c997b: Add memory mapped datasets
- 0add50c: Allow cycling over multiple datasets, where each one becomes an "epoch"

Plus many additional features and bugfixes
Pull Request resolved: https://github.com/pytorch/fairseq/pull/817

Differential Revision: D15913844

Pulled By: myleott

fbshipit-source-id: d5b5d678efdd9dd3e4d7ca848ddcf1ec2b21bf6b
2019-06-19 19:08:50 -07:00
Myle Ott
dffb167449 Updates to model API (#561)
Summary:
- `FairseqModel` -> `FairseqEncoderDecoderModel`
- add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer`
- `encoder_out_dict` -> `encoder_out`
- rm unused `remove_head` functions
- update docs
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/561

Differential Revision: D15271142

Pulled By: myleott

fbshipit-source-id: 8e8864e399336020f0271c780598e968ff51a264
2019-05-15 07:12:41 -07:00
zhiqiang
d0577ba7a5 Fix option in docs (#735)
Summary:
`--output-format` -> `--dataset-impl` in Tutorial: Classifying Names with a Character-Level RNN
Pull Request resolved: https://github.com/pytorch/fairseq/pull/735

Differential Revision: D15314625

Pulled By: myleott

fbshipit-source-id: 65b8efd1a367ca754e5b9dca088aefbc648864dd
2019-05-12 16:37:59 -07:00
Myle Ott
d45db80431 Merge internal changes (#654)
Summary:
- Add --add-bos-token option to LM task
- Cleanup utils.py and options.py
Pull Request resolved: https://github.com/pytorch/fairseq/pull/654

Differential Revision: D15041794

Pulled By: myleott

fbshipit-source-id: 3ad00007769d5f48308052cfd40de39c5ffa1a6e
2019-04-29 19:50:58 -07:00
Myle Ott
e6422528da 0.6.1 -> 0.6.2 (#577)
Summary:
Changelog:
- 998ba4f: Add language models from Baevski & Auli (2018)
- 4294c4f: Add mixture of experts code from Shen et al. (2019)
- 0049349: Add example for multilingual training
- 48d9afb: Speed improvements, including fused operators from apex
- 44d27e6: Add Tensorboard support
- d17fa85: Add Adadelta optimizer
- 9e1c880: Add `FairseqEncoderModel`
- b65c579: Add `FairseqTask.inference_step` to modularize generate.py
- 2ad1178: Add back `--curriculum`
- Misc bug fixes and other features

Pull Request resolved: https://github.com/pytorch/fairseq/pull/577

Differential Revision: D14481233

Pulled By: myleott

fbshipit-source-id: 4ff8625ef1c0b24273fc65df7c5658e3c932e8b7
2019-03-15 10:27:01 -07:00
Vladimir Karpukhin
f296824f40 Move string line encoding logic from tokenizer to Dictionary (unified diff). (#541)
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/541

Just a combo of a stacked pair, D14057943 & D14176011.
Made this as a separate diff because there seems to be some issue with porting a stacked change into the github repo.

Differential Revision: D14251048

fbshipit-source-id: 0a47f534a69d6ab2ebe035fba40fd51748cccfb8
2019-02-28 09:19:12 -08:00
Myle Ott
fbd4cef9a5 Add fairseq to PyPI (#495)
Summary:
- fairseq can now be installed via pip: `pip install fairseq`
- command-line tools are globally accessible: `fairseq-preprocess`, `fairseq-train`, `fairseq-generate`, etc.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/495

Differential Revision: D14017761

Pulled By: myleott

fbshipit-source-id: 10c9f6634a3056074eac2f33324b4f1f404d4235
2019-02-08 22:03:29 -08:00
Myle Ott
b41c74dc5b Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473)
Summary:
Changelog:
- `e330f56`: Add code for the "Pay Less Attention with Lightweight and Dynamic Convolutions" paper
- `5e3b98c`: Add scripts for computing tokenized BLEU with compound splitting and sacrebleu
- update READMEs
- misc fixes
Pull Request resolved: https://github.com/pytorch/fairseq/pull/473

Differential Revision: D13819717

Pulled By: myleott

fbshipit-source-id: f2dc12ea89a436b950cafec3593ed1b04af808e9
2019-01-25 15:40:26 -08:00
Davide Caroselli
ebaf8c5030 '--user-dir' documentation (correct) (#447)
Summary:
Command line option --user-dir documented in docs/overview.rst
Pull Request resolved: https://github.com/pytorch/fairseq/pull/447

Differential Revision: D13674744

Pulled By: myleott

fbshipit-source-id: 17049ee5c9f692f5298ef9fa7381ee583f269cde
2019-01-15 11:54:17 -08:00
Myle Ott
14bd9c62a3 Update docs for --lazy-load and torch.distributed.launch
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/433

Differential Revision: D13588032

Pulled By: myleott

fbshipit-source-id: 0e5ff361e27b206c4490264f0f51863367499e81
2019-01-07 15:28:09 -08:00
Myle Ott
7633129ba8 Merge internal changes (#283)
Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/283

Pull Request resolved: https://github.com/pytorch/fairseq/pull/428

Differential Revision: D13564190

Pulled By: myleott

fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5
2019-01-04 20:03:19 -08:00
Sergey Edunov
1082ba352c Switch to DistributedDataParallelC10d and bump version 0.5.0 -> 0.6.0
- no more FP16Trainer, we just have an FP16Optimizer wrapper
- most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time
- Trainer now requires an extra dummy_batch argument at initialization, which we do fwd/bwd on when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0 (see the sketch below)
- Trainer.train_step now takes a list of samples, which will allow cleaner --update-freq
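
A minimal sketch of that zero-loss trick (illustrative names, not the actual Trainer code; assumes PyTorch modules for `model` and `criterion`): the dummy forward/backward keeps the distributed all-reduce in sync across workers while contributing exactly zero gradient.

```python
def train_step(model, criterion, sample, is_dummy_batch):
    """One fwd/bwd pass; dummy batches sync collectives but add no gradient."""
    loss = criterion(model(sample["net_input"]), sample["target"])
    if is_dummy_batch:
        # Multiplying by 0 keeps the autograd graph alive so backward() still
        # participates in gradient all-reduce, but every gradient it produces
        # is exactly zero.
        loss = loss * 0.0
    loss.backward()
    return loss.detach()
```
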
2018-09-25 17:36:43 -04:00
Sergey Edunov
fe2d1581a4 Fix docs 2018-09-17 22:34:17 -07:00
Myle Ott
4a47b88992 Update documentation 2018-09-03 20:03:37 -04:00
Myle Ott
6381cc977f Add documentation 2018-09-03 19:15:23 -04:00