Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1268
We previously had a memory leak when using sharded datasets. In particular,
each sharded dataset is a new FairseqDataset instance, and the cache is keyed
by the `dataset` instance. Since we never clear the cache, this would
eventually cause the system to run out of CPU RAM.
This diff disables caching when using sharded datasets.
Note that we also change the signature of `get_batch_iterator`, which needs to
propagate to many places. We previously avoided this update when adding
`data_buffer_size`, so I'm also adding that everywhere.
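A minimal sketch of the failure mode, assuming a cache keyed by the dataset instance; the `disable_iterator_cache` keyword below is illustrative and may not match the exact new signature:
```python
class FakeShardedDataset:
    """Stand-in for one sharded FairseqDataset instance."""
    pass

_iterator_cache = {}  # keyed by dataset instance, as described above

def get_batch_iterator(dataset, disable_iterator_cache=False):
    # hypothetical simplified version; the real change touches many callers
    if not disable_iterator_cache and dataset in _iterator_cache:
        return _iterator_cache[dataset]
    iterator = iter(range(10))  # placeholder for the real epoch batch iterator
    if not disable_iterator_cache:
        _iterator_cache[dataset] = iterator
    return iterator

# Without disabling the cache, every new shard leaves behind a stale entry:
for _ in range(3):
    get_batch_iterator(FakeShardedDataset())
assert len(_iterator_cache) == 3  # grows unboundedly over a long training run
```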
Reviewed By: ngoyal2707
Differential Revision: D23319135
fbshipit-source-id: 6bcd6aee141ad9cc234448c49106a8dbf8ea1800
Summary:
Incorporate several fixes, including some from OSS contributors:
- fix model argument in sequence generator in semisupervised_translation.py
- fix aggregate logging in semisupervised_translation.py
- Fix EOS token in multilingual_denoising
- Handle missing eos_idx in data_utils.collate_tokens (see the sketch after this list)
- Better OOM handling for single-GPU training
- fix prepend_bos argument in translation_from_pretrained_bart.py …
- Fix eos_idx in multilingual_denoising
- Small logging fixes
- Fix fb_hub on PyTorch 1.6
- Better variable names
- Add support for model parallel to interactive.py
- Use `//` operator to fix Integer division warning
- Set default `--clip-norm=0.0`
- Cleanup some binaries in root directory
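To illustrate the `eos_idx` item above, here is a hedged, simplified sketch of what handling a missing `eos_idx` in `data_utils.collate_tokens` can look like (the real function supports more options, e.g. left-padding):
```python
import torch

def collate_tokens(values, pad_idx, eos_idx=None, move_eos_to_beginning=False):
    # simplified stand-in for fairseq.data.data_utils.collate_tokens
    size = max(v.size(0) for v in values)
    res = values[0].new_full((len(values), size), pad_idx)
    for i, v in enumerate(values):
        if move_eos_to_beginning:
            # when no explicit eos_idx is given, fall back to the sequence's
            # own final token instead of failing on None
            res[i, 0] = v[-1] if eos_idx is None else eos_idx
            res[i, 1:v.size(0)] = v[:-1]
        else:
            res[i, :v.size(0)] = v
    return res

batch = collate_tokens([torch.tensor([5, 6, 2]), torch.tensor([7, 2])],
                       pad_idx=1, move_eos_to_beginning=True)
print(batch)  # tensor([[2, 5, 6], [2, 7, 1]])
```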
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1196
Reviewed By: ngoyal2707
Differential Revision: D22162202
Pulled By: myleott
fbshipit-source-id: 835b0c0ad9246827f9d915fdb4e89d7b5be2475d
Summary:
Sanitized vq-wav2vec implementation. I will also add docs for this. I have a fixed-up checkpoint that this code can load, and I verified that it produces the same results as what we used in the paper.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1029
Differential Revision: D20129246
Pulled By: alexeib
fbshipit-source-id: f72f455e0c309168e644ab86ec18c768c308da98
Summary:
Hi,
I think there is a minor mistake in the doc. The `--distributed-no-spawn` argument is needed for distributed training on multiple machines without `slurm`. Otherwise, the program will start 8 jobs on each GPU when `nproc_per_node=8`.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1188
Differential Revision: D17627778
Pulled By: myleott
fbshipit-source-id: 35ab6b650dc1132d7cb2d150e80d2ebf0caf3e69
Summary:
No major API changes since the last release. Cutting a new release since we'll be merging significant (possibly breaking) changes to logging, data loading and the masked LM implementation soon.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/891
Differential Revision: D16377132
Pulled By: myleott
fbshipit-source-id: f1cb88e671ccd510e53334d0f449fe18585268c7
Summary:
Notable (possibly breaking) changes:
- d45db80: Move checkpoint utility functions from utils.py into checkpoint_utils.py
- f2563c2: Move LM definitions into separate files
- dffb167: Updates to model API:
- `FairseqModel` -> `FairseqEncoderDecoderModel`
- add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer` (see the sketch after this list)
- `encoder_out_dict` -> `encoder_out`
- rm unused `remove_head` functions
- 34726d5: Move `distributed_init` into `DistributedFairseqModel`
- cf17068: Simplify distributed launch by automatically launching multiprocessing on each node for all visible GPUs (allows launching just one job per node instead of one per GPU)
- d45db80: Change default LR scheduler from `reduce_lr_on_plateau` to `fixed`
- 96ac28d: Rename `--sampling-temperature` -> `--temperature`
- fc1a19a: Deprecate dummy batches
- a1c997b: Add memory mapped datasets
- 0add50c: Allow cycling over multiple datasets, where each one becomes an "epoch"
Plus many additional features and bugfixes
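To make the `extract_features`/`output_layer` split above concrete, here is a hedged toy sketch; only the method split mirrors the API change, the module internals are placeholders:
```python
import torch
import torch.nn as nn

class ToyDecoder(nn.Module):
    def __init__(self, embed_dim=16, vocab_size=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.proj = nn.Linear(embed_dim, vocab_size)

    def extract_features(self, prev_output_tokens, encoder_out=None):
        # return hidden features before the output projection
        return self.embed(prev_output_tokens)

    def output_layer(self, features):
        # project features to vocabulary logits
        return self.proj(features)

    def forward(self, prev_output_tokens, encoder_out=None):
        # forward composes the two new methods
        return self.output_layer(self.extract_features(prev_output_tokens, encoder_out))

logits = ToyDecoder()(torch.tensor([[3, 4, 5]]))
print(logits.shape)  # torch.Size([1, 3, 100])
```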
Pull Request resolved: https://github.com/pytorch/fairseq/pull/817
Differential Revision: D15913844
Pulled By: myleott
fbshipit-source-id: d5b5d678efdd9dd3e4d7ca848ddcf1ec2b21bf6b
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/541
Just a combo of the stacked pair D14057943 & D14176011.
Made this a separate diff because there seems to be some issue with porting a stacked change into the GitHub repo.
Differential Revision: D14251048
fbshipit-source-id: 0a47f534a69d6ab2ebe035fba40fd51748cccfb8
- no more FP16Trainer; we just have an FP16Optimizer wrapper
- most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time
- Trainer now requires an extra dummy_batch argument at initialization, which we do fwd/bwd on when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0 (see the sketch below)
- Trainer.train_step now takes a list of samples, which will allow cleaner --update-freq
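A small sketch of the dummy-batch mechanism in the third bullet, under the assumption that multiplying the loss by 0 is enough to keep the backward pass (and any gradient all-reduce) in lockstep while contributing no gradient signal:
```python
import torch
import torch.nn as nn

model = nn.Linear(8, 1)
dummy_batch = torch.zeros(2, 8)  # stand-in for the cached dummy sample

def train_step(batch, is_dummy):
    loss = model(batch).sum()
    if is_dummy:
        loss = loss * 0  # keeps the graph and collective-op shapes, zeroes gradients
    loss.backward()
    return loss

train_step(dummy_batch, is_dummy=True)
print(model.weight.grad.abs().sum())  # tensor(0.) -- no real gradient contribution
```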