Myle Ott
be3515b289
More fully deprecate --raw-text and --lazy-load ( fixes #1488 )
...
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/947
Differential Revision: D19084273
Pulled By: myleott
fbshipit-source-id: de80d9abfac8e3d813a9c9b343b41327c500344e
2019-12-16 17:22:11 -08:00
Myle Ott
df2f84ce61
v0.8.0 -> v0.9.0 ( #1452 )
...
Summary:
Possibly breaking changes:
- Set global numpy seed (4a7cd58)
- Split `in_proj_weight` into separate k, v, q projections in MultiheadAttention (fdf4c3e)
- TransformerEncoder returns namedtuples instead of dict (27568a7)
New features:
- Add `--fast-stat-sync` option (e1ba32a)
- Add `--empty-cache-freq` option (315c463)
- Support criterions with parameters (ba5f829)
New papers:
- Simple and Effective Noisy Channel Modeling for Neural Machine Translation (49177c9)
- Levenshtein Transformer (86857a5, ...)
- Cross+Self-Attention for Transformer Models (4ac2c5f)
- Jointly Learning to Align and Translate with Transformer Models (1c66792)
- Reducing Transformer Depth on Demand with Structured Dropout (dabbef4)
- Unsupervised Cross-lingual Representation Learning at Scale (XLM-RoBERTa) (e23e5ea)
- BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (a92bcda)
- CamemBERT: a French BERT (b31849a)
Speed improvements:
- Add CUDA kernels for LightConv and DynamicConv (f840564)
- Cythonization of various dataloading components (4fc3953, ...)
- Don't project mask tokens for MLM training (718677e)
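The "set global numpy seed" entry above is flagged as possibly breaking because data shuffling becomes tied to numpy's global RNG state. A minimal sketch of why that matters for reproducible dataloading (the `set_seed` helper and per-epoch offset are illustrative assumptions, not fairseq's actual code):

```python
import numpy as np

def set_seed(seed, epoch=0):
    # Illustrative helper: seed numpy's *global* RNG so shuffling is
    # reproducible across runs, offsetting by epoch so each epoch still
    # gets a different (but deterministic) shuffle.
    np.random.seed(seed + epoch)

set_seed(1)
a = np.random.permutation(10)
set_seed(1)
b = np.random.permutation(10)
assert (a == b).all()  # same seed -> identical shuffle order
```

The flip side is the "possibly breaking" part: any other code relying on numpy's global RNG now sees a fixed stream as well.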
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1452
Differential Revision: D18798409
Pulled By: myleott
fbshipit-source-id: 860a0d5aaf7377c8c9bd63cdb3b33d464f0e1727
2019-12-03 15:19:33 -08:00
Kevin
13d9e2baf8
Fix changes of file locations of subword-nmt ( #1219 )
...
Summary:
Solves https://github.com/pytorch/fairseq/issues/1218.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1219
Differential Revision: D18339541
Pulled By: myleott
fbshipit-source-id: 6d5bd7b60fa7fd30c038fdad54591343a01f228b
2019-11-07 09:08:29 -08:00
Myle Ott
a0f75996b1
Fix building of docs
...
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1340
Differential Revision: D18289455
Pulled By: myleott
fbshipit-source-id: a1c8163a35273b6c646d300142701e8a317d7378
2019-11-02 16:52:50 -07:00
Zhanghao Wu
2314979ea5
Update getting_started.rst ( #1188 )
...
Summary:
Hi,
I think there is a minor mistake in the doc. `--distributed-no-spawn` argument is needed for distributed training on multiple machines without `slurm`. Otherwise, the program will start 8 jobs on each GPU, when `nproc_per_node=8`.
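The fix described above concerns multi-node launches driven by `torch.distributed.launch`. A sketch of the kind of command involved, with hostnames, ports, data paths, and model arguments as placeholders (this is an assumption of typical usage, not a command taken from the docs):

```shell
# Run on each of 2 machines, with node_rank=0 and node_rank=1 respectively.
# torch.distributed.launch starts one process per GPU (--nproc_per_node=8);
# --distributed-no-spawn tells fairseq not to spawn its own per-GPU workers
# on top of those, which would otherwise yield 8 jobs per GPU.
python -m torch.distributed.launch --nproc_per_node=8 \
    --nnodes=2 --node_rank=0 \
    --master_addr=192.168.1.1 --master_port=12345 \
    $(which fairseq-train) data-bin/example \
    --arch transformer --distributed-no-spawn
```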
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1188
Differential Revision: D17627778
Pulled By: myleott
fbshipit-source-id: 35ab6b650dc1132d7cb2d150e80d2ebf0caf3e69
2019-09-27 07:27:28 -07:00
Jerry Ma
3f4fc50163
Miscellaneous documentation improvements: ( #868 )
...
Summary:
- More clearly document the correspondence between FairseqAdam and torch.optim.AdamW
- Add ResamplingDataset to Sphinx docs
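The FairseqAdam / torch.optim.AdamW correspondence mentioned above comes down to decoupled weight decay. A small numpy sketch of one AdamW-style step (illustrative only; `adamw_step` is a hypothetical helper, not fairseq's or PyTorch's implementation):

```python
import numpy as np

def adamw_step(p, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    # One update with *decoupled* weight decay: the decay is applied
    # directly to the parameters instead of being folded into the
    # gradient, which is what makes this AdamW-like rather than Adam-like.
    m = beta1 * m + (1 - beta1) * g          # first-moment estimate
    v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    p = p - lr * m_hat / (np.sqrt(v_hat) + eps)  # Adam step
    p = p - lr * weight_decay * p                # decoupled decay
    return p, m, v

p = np.ones(3)
p, m, v = adamw_step(p, np.zeros(3), np.zeros(3), np.zeros(3), t=1)
```

Note that with a zero gradient the Adam step is zero but the decay still shrinks the parameters; with L2-regularization-style (coupled) decay the two would interact through the moment estimates.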
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/868
Differential Revision: D17523244
Pulled By: jma127
fbshipit-source-id: 8e7b34b24889b2c8f70b09a52a625d2af135734b
2019-09-23 12:27:12 -07:00
Myle Ott
ffffe04ea1
v0.7.2 -> v0.8.0 ( #1017 )
...
Summary:
Changelog:
- Relicensed under MIT license
- Add RoBERTa
- Add wav2vec
- Add WMT'19 models
- Add initial ASR code
- Changed torch.hub interface (`generate` renamed to `translate`)
- Add `--tokenizer` and `--bpe`
- `f812e52`: Renamed data.transforms -> data.encoders
- `654affc`: New Dataset API (optional)
- `47fd985`: Deprecate old Masked LM components
- `5f78106`: Set mmap as default dataset format and infer format automatically
- Misc fixes for sampling
- Misc fixes to support PyTorch 1.2
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1017
Differential Revision: D16799880
Pulled By: myleott
fbshipit-source-id: 45ad8bc531724a53063cbc24ca1c93f715cdc5a7
2019-08-14 05:02:45 -07:00
Myle Ott
8835d93cf0
Standardize on 'teacher forcing' rather than 'input feeding' which is… ( #769 )
...
Summary:
Input feeding generally refers to a slightly different concept
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/769
Differential Revision: D16491898
Pulled By: myleott
fbshipit-source-id: 68573584e820f11f199db4e7e37e9ee7a69a3287
2019-07-25 07:24:07 -07:00
Myle Ott
8af5554269
Improve interactive generation (support --tokenizer and --bpe)
...
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/734
Differential Revision: D16377044
Pulled By: myleott
fbshipit-source-id: 37d5553d76aa7c653113fec089f59710281c31d7
2019-07-19 06:45:18 -07:00
Myle Ott
b002d0096e
v0.7.1 -> v0.7.2 ( #891 )
...
Summary:
No major API changes since the last release. Cutting a new release since we'll be merging significant (possibly breaking) changes to logging, data loading and the masked LM implementation soon.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/891
Differential Revision: D16377132
Pulled By: myleott
fbshipit-source-id: f1cb88e671ccd510e53334d0f449fe18585268c7
2019-07-19 06:33:40 -07:00
Myle Ott
881381cfc7
v0.7.1: fix PyPI setup and tests
...
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/818
Differential Revision: D15916265
Pulled By: myleott
fbshipit-source-id: c66c0bd988d3472c4150226952f34ee8d4c3db86
2019-06-20 06:28:37 -07:00
Myle Ott
bd710e75ae
v0.7.0 ( #817 )
...
Summary:
Notable (possibly breaking) changes:
- d45db80: Move checkpoint utility functions from utils.py into checkpoint_utils.py
- f2563c2: Move LM definitions into separate files
- dffb167: Updates to model API:
  - `FairseqModel` -> `FairseqEncoderDecoderModel`
  - add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer`
  - `encoder_out_dict` -> `encoder_out`
  - rm unused `remove_head` functions
- 34726d5: Move `distributed_init` into `DistributedFairseqModel`
- cf17068: Simplify distributed launch by automatically launching multiprocessing on each node for all visible GPUs (allows launching just one job per node instead of one per GPU)
- d45db80: Change default LR scheduler from `reduce_lr_on_plateau` to `fixed`
- 96ac28d: Rename `--sampling-temperature` -> `--temperature`
- fc1a19a: Deprecate dummy batches
- a1c997b: Add memory mapped datasets
- 0add50c: Allow cycling over multiple datasets, where each one becomes an "epoch"
Plus many additional features and bugfixes
Pull Request resolved: https://github.com/pytorch/fairseq/pull/817
Differential Revision: D15913844
Pulled By: myleott
fbshipit-source-id: d5b5d678efdd9dd3e4d7ca848ddcf1ec2b21bf6b
2019-06-19 19:08:50 -07:00
Myle Ott
dffb167449
Updates to model API ( #561 )
...
Summary:
- `FairseqModel` -> `FairseqEncoderDecoderModel`
- add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer`
- `encoder_out_dict` -> `encoder_out`
- rm unused `remove_head` functions
- update docs
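The decomposition above splits a decoder's `forward` into feature extraction and output projection. A toy sketch of the shape of that API (illustrative only; `ToyDecoder` is hypothetical, not the real `FairseqDecoder`):

```python
import numpy as np

class ToyDecoder:
    # Sketch of the decomposed decoder API: `forward` is just
    # `output_layer(extract_features(x))`, so callers who only need
    # hidden states (e.g. for probing) can stop after extract_features.
    def __init__(self, hidden_dim, vocab_size, seed=0):
        rng = np.random.default_rng(seed)
        self.proj = rng.standard_normal((hidden_dim, vocab_size))

    def extract_features(self, x):
        # Stand-in for the real decoder stack: just a nonlinearity here.
        return np.tanh(x)

    def output_layer(self, features):
        # Project hidden features to vocabulary logits.
        return features @ self.proj

    def forward(self, x):
        return self.output_layer(self.extract_features(x))

dec = ToyDecoder(hidden_dim=4, vocab_size=10)
logits = dec.forward(np.ones((2, 4)))  # shape (2, 10)
```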
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/561
Differential Revision: D15271142
Pulled By: myleott
fbshipit-source-id: 8e8864e399336020f0271c780598e968ff51a264
2019-05-15 07:12:41 -07:00
zhiqiang
d0577ba7a5
Fix option in docs ( #735 )
...
Summary:
`--output-format` -> `--dataset-impl` in Tutorial: Classifying Names with a Character-Level RNN
Pull Request resolved: https://github.com/pytorch/fairseq/pull/735
Differential Revision: D15314625
Pulled By: myleott
fbshipit-source-id: 65b8efd1a367ca754e5b9dca088aefbc648864dd
2019-05-12 16:37:59 -07:00
Myle Ott
d45db80431
Merge internal changes ( #654 )
...
Summary:
- Add --add-bos-token option to LM task
- Cleanup utils.py and options.py
Pull Request resolved: https://github.com/pytorch/fairseq/pull/654
Differential Revision: D15041794
Pulled By: myleott
fbshipit-source-id: 3ad00007769d5f48308052cfd40de39c5ffa1a6e
2019-04-29 19:50:58 -07:00
Myle Ott
e6422528da
0.6.1 -> 0.6.2 ( #577 )
...
Summary:
Changelog:
- 998ba4f: Add language models from Baevski & Auli (2018)
- 4294c4f: Add mixture of experts code from Shen et al. (2019)
- 0049349: Add example for multilingual training
- 48d9afb: Speed improvements, including fused operators from apex
- 44d27e6: Add Tensorboard support
- d17fa85: Add Adadelta optimizer
- 9e1c880: Add `FairseqEncoderModel`
- b65c579: Add `FairseqTask.inference_step` to modularize generate.py
- 2ad1178: Add back `--curriculum`
- Misc bug fixes and other features
Pull Request resolved: https://github.com/pytorch/fairseq/pull/577
Differential Revision: D14481233
Pulled By: myleott
fbshipit-source-id: 4ff8625ef1c0b24273fc65df7c5658e3c932e8b7
2019-03-15 10:27:01 -07:00
Vladimir Karpukhin
f296824f40
Move string line encoding logic from tokenizer to Dictionary (unified diff). ( #541 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/541
Just a combo of a stacked pair, D14057943 & D14176011.
Made this as a separate diff because there seems to be some issue with porting a stacked change into the GitHub repo.
Differential Revision: D14251048
fbshipit-source-id: 0a47f534a69d6ab2ebe035fba40fd51748cccfb8
2019-02-28 09:19:12 -08:00
Myle Ott
fbd4cef9a5
Add fairseq to PyPI ( #495 )
...
Summary:
- fairseq can now be installed via pip: `pip install fairseq`
- command-line tools are globally accessible: `fairseq-preprocess`, `fairseq-train`, `fairseq-generate`, etc.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/495
Differential Revision: D14017761
Pulled By: myleott
fbshipit-source-id: 10c9f6634a3056074eac2f33324b4f1f404d4235
2019-02-08 22:03:29 -08:00
Myle Ott
b41c74dc5b
Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" ( #473 )
...
Summary:
Changelog:
- `e330f56`: Add code for the "Pay Less Attention with Lightweight and Dynamic Convolutions" paper
- `5e3b98c`: Add scripts for computing tokenized BLEU with compound splitting and sacrebleu
- update READMEs
- misc fixes
Pull Request resolved: https://github.com/pytorch/fairseq/pull/473
Differential Revision: D13819717
Pulled By: myleott
fbshipit-source-id: f2dc12ea89a436b950cafec3593ed1b04af808e9
2019-01-25 15:40:26 -08:00
Davide Caroselli
ebaf8c5030
'--user-dir' documentation (correct) ( #447 )
...
Summary:
Command line option --user-dir documented in docs/overview.rst
Pull Request resolved: https://github.com/pytorch/fairseq/pull/447
Differential Revision: D13674744
Pulled By: myleott
fbshipit-source-id: 17049ee5c9f692f5298ef9fa7381ee583f269cde
2019-01-15 11:54:17 -08:00
Myle Ott
14bd9c62a3
Update docs for --lazy-load and torch.distributed.launch
...
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/433
Differential Revision: D13588032
Pulled By: myleott
fbshipit-source-id: 0e5ff361e27b206c4490264f0f51863367499e81
2019-01-07 15:28:09 -08:00
Myle Ott
7633129ba8
Merge internal changes ( #283 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/283
Pull Request resolved: https://github.com/pytorch/fairseq/pull/428
Differential Revision: D13564190
Pulled By: myleott
fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5
2019-01-04 20:03:19 -08:00
Sergey Edunov
1082ba352c
Switch to DistributedDataParallelC10d and bump version 0.5.0 -> 0.6.0
...
- no more FP16Trainer; we just have an FP16Optimizer wrapper
- most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time
- Trainer now requires an extra dummy_batch argument at initialization, on which we run forward/backward when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0
- Trainer.train_step now takes a list of samples, which will allow cleaner `--update-freq` handling
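The dummy-batch trick described above can be sketched numerically: running forward/backward on a loss scaled by 0 keeps workers in lockstep while contributing exactly zero gradient. An illustrative numpy example with a linear model and squared loss (the `grad_for_batch` helper is hypothetical, not fairseq's code):

```python
import numpy as np

def grad_for_batch(w, x, y, is_dummy):
    # Gradient of scale * 0.5 * ||x @ w - y||^2 with respect to w.
    # On dummy batches we scale the loss by 0, so the forward/backward
    # pass still happens but the gradient vanishes.
    scale = 0.0 if is_dummy else 1.0
    pred = x @ w
    return scale * (x.T @ (pred - y))

w = np.ones(3)
x = np.eye(3)
y = np.zeros(3)
g_dummy = grad_for_batch(w, x, y, is_dummy=True)   # all zeros
g_real = grad_for_batch(w, x, y, is_dummy=False)   # nonzero
```

This matters for synchronous distributed training: every worker must call backward the same number of times per step, even when it has run out of real batches.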
2018-09-25 17:36:43 -04:00
Sergey Edunov
fe2d1581a4
Fix docs
2018-09-17 22:34:17 -07:00
Myle Ott
4a47b88992
Update documentation
2018-09-03 20:03:37 -04:00
Myle Ott
6381cc977f
Add documentation
2018-09-03 19:15:23 -04:00