Summary:
Typically `torch.hub.load(...)` doesn't call `pip install`, so our Cython components never get built. We have a hack in our hubconf that builds these components by running the equivalent of `python setup.py build_ext --inplace` using the setuptools sandbox: f6677b6755/hubconf.py (L52-L55).
Unfortunately, this sandbox gets mad if you modify the filesystem, which is exactly what this recent change does: f6677b6755/setup.py (L203-L205). Combined, these two things break torch.hub.
The solution is to not set up the symlinks when we're doing `build_ext`. This is fine, since `build_ext` doesn't actually build a package, so we don't care about including the config or examples.
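The shape of the fix, as a minimal sketch (the symlink target here is an assumption, not the exact diff):

```python
# setup.py (sketch)
import os
import sys

# Only create the examples/config symlinks when actually building a
# package. `build_ext --inplace` (what our hubconf runs inside the
# setuptools sandbox) just compiles extensions, and the sandbox errors
# out on filesystem modifications such as os.symlink.
if "build_ext" not in sys.argv[1:]:
    link = os.path.join("fairseq", "examples")
    if not os.path.lexists(link):
        os.symlink(os.path.join("..", "examples"), link)
```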
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2762
Reviewed By: alexeib
Differential Revision: D24430228
Pulled By: myleott
fbshipit-source-id: e05d075a003ddfde196cb8a86b32882d73808015
Summary:
Imported from https://github.com/fairinternal/fairseq-py/pull/1284. Updated according to PR comments.
Main changes:
* New task: `fairseq.tasks.speech_to_text`
* Multilingual support: multiple train sub-splits, temperature-based sampling (see the sketch after this list) and language ID tokens
* New dataset: `fairseq.data.audio.speech_to_text_dataset`
* Added accuracy metrics and BOS prefix removal to label smoothed cross entropy
* New models: Transformer (`fairseq.models.speech_to_text.s2t_transformer`) and BLSTM (`fairseq.models.speech_to_text.berard`)
* Extended scorers:
* Added a base scorer class: `fairseq.scorers.BaseScorer` (the parent class for all scorers except the BLEU scorer in CPP)
* Added an evaluation tokenizer: `fairseq.scorers.eval_tokenizer` which leverages sacreBLEU's built-in tokenizers and allows character-level tokenization as well as punctuation removal (for WER scoring).
* Added chrF scorer: `fairseq.scorers.chrf`
* Online Mel-filter bank speech feature extraction (via CPP-based pyKaldi or Python-based TorchAudio): `fairseq.data.audio.audio_utils`
* Online speech feature transforms: `fairseq.data.audio.feature_transforms.*`
* Fixed the subsampled sequence lengths in VGGTransformer (`examples.speech_recognition.models.vggtransformer`)
* Examples under `examples/speech_to_text`:
* LibriSpeech (ASR): better results than VGGTransformer with smaller Transformer-based models
* MuST-C (ST): comparable to [SOTA results](https://arxiv.org/pdf/2004.10234.pdf) but with fewer tricks
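For context on the temperature-based sampling above: with per-language dataset sizes `n_i`, examples are drawn with probability proportional to `(n_i / Σ_j n_j)^(1/T)`, which upweights low-resource languages as `T` grows. A self-contained sketch (not the fairseq implementation):

```python
import numpy as np

def temperature_sampling_probs(sizes, temperature=5.0):
    """Per-language sampling probability: p_i ∝ (n_i / Σ_j n_j)^(1/T).

    T=1 recovers size-proportional sampling; larger T flattens the
    distribution toward uniform, upweighting low-resource languages.
    """
    sizes = np.asarray(sizes, dtype=np.float64)
    probs = (sizes / sizes.sum()) ** (1.0 / temperature)
    return probs / probs.sum()

# e.g. three languages with very different amounts of training data
print(temperature_sampling_probs([1_000_000, 100_000, 10_000]))
```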
Reviewed By: jmp84
Differential Revision: D24065273
fbshipit-source-id: 5f842ca9c826f92d4af660705611885fe440a9ab
Summary:
# Before submitting
- [ ] Was this discussed/approved via a GitHub issue? (not needed for typos or doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [x] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?
## What does this PR do?
Runs CI for `fairseq` on all major platforms provided by GitHub actions.
## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1595
Differential Revision: D19438282
Pulled By: myleott
fbshipit-source-id: a64db46d7785e6f583848f27699f6463c4dc3170
Summary:
## What does this PR do?
CUDA implementation of Levenshtein distance for NAT and other potential applications.
It will make training the Levenshtein Transformer slightly faster and clean up the related functions.
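For reference, the quantity being moved to CUDA is plain edit distance; a pure-Python version of the dynamic program for a single pair (the CUDA kernel presumably batches this across the GPU) looks like:

```python
def levenshtein_distance(a, b):
    """Edit distance between sequences a and b via the classic
    O(len(a) * len(b)) dynamic program, using a single rolling row."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))  # row for the empty prefix of a
    for i in range(1, m + 1):
        prev_diag, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            prev_diag, dp[j] = dp[j], min(
                dp[j] + 1,         # deletion
                dp[j - 1] + 1,     # insertion
                prev_diag + cost,  # substitution or match
            )
    return dp[n]

assert levenshtein_distance("kitten", "sitting") == 3
```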
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/960
Test Plan: Imported from GitHub. Tested locally.
Reviewed By: cndn
Differential Revision: D19207096
Pulled By: MultiPath
fbshipit-source-id: 4890bbaa851ffd302648c0d949173158dc3167e2
Summary:
Code for our NeurIPS paper [Levenshtein Transformer](https://arxiv.org/abs/1905.11006)
* Added Levenshtein Transformer model, task and criterion classes (a rough decoding sketch follows this list)
* Added iterative NAT Transformer, insertion Transformer and CMLM Transformer model classes as baselines
* Added an option for prepending BOS to the dictionary class and translation task class
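As a rough orientation, the model refines a hypothesis by alternating deletion and insertion, per the paper; the method names below are hypothetical, not the fairseq API:

```python
def levenshtein_decode(model, encoder_out, y, max_iter=10):
    """Sketch of Levenshtein Transformer refinement (hypothetical API).

    Each round applies the three policy heads in order: delete tokens,
    predict how many placeholders to insert between the survivors, then
    fill the placeholders with real tokens.
    """
    for _ in range(max_iter):
        y_prev = y
        y = model.delete_tokens(y, encoder_out)        # deletion classifier
        y = model.insert_placeholders(y, encoder_out)  # insertion predictor
        y = model.fill_placeholders(y, encoder_out)    # token classifier
        if y == y_prev:  # no edits proposed: converged
            break
    return y
```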
Reviewed By: myleott
Differential Revision: D17297372
fbshipit-source-id: 54eca60831ae95dc721c2c34e882e1810ee575c7
Summary:
Cythonized the token block dataset code; it's `> 100x` faster. Token blocking for the entire `bookwiki+CC+stories+openweb` corpus now takes just ~`39.9` seconds.
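For context, the hot loop turns an array of per-example sizes into `(start, end)` slices over the concatenated token stream. A pure-Python equivalent for the simplest fixed-size mode (a sketch, not the Cython code):

```python
def fixed_size_blocks(sizes, block_size):
    """Slice a concatenated token stream with the given per-example
    sizes into contiguous blocks of up to `block_size` tokens."""
    total = sum(sizes)
    return [
        (start, min(start + block_size, total))
        for start in range(0, total, block_size)
    ]

# e.g. documents of 5, 3 and 7 tokens, blocks of 4 tokens
print(fixed_size_blocks([5, 3, 7], 4))  # [(0, 4), (4, 8), (8, 12), (12, 15)]
```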
TODO:
1) I think I can make it another 2x faster.
2) cleanup.
EDIT History:
~~First pass at parallelizing `token_block_dataset`. The code feels somewhat complicated and cluttered.
This is 2-3x faster, though, in my tests on the `bookwiki` dataset with both `complete` and `complete_doc` modes.
myleott, can you take a look for correctness? I am still not 100% sure that I am not missing corner cases.~~
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/834
Test Plan:
Imported from GitHub, without a `Test Plan:` line.
Test workflow: f133816198
Reviewed By: myleott
Differential Revision: D16970257
Pulled By: myleott
fbshipit-source-id: ec45a308193c9e9f3e7075336c15df4723228d6f
Summary:
The previous BSD+PATENTS license was controversial. We have been
approved to relicense fairseq under the MIT license.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/786
Differential Revision: D16560654
Pulled By: myleott
fbshipit-source-id: f78b1beb4f2895dd7b9bfc79f5f952a2bfb94034
Summary:
No major API changes since the last release. Cutting a new release since we'll be merging significant (possibly breaking) changes to logging, data loading and the masked LM implementation soon.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/891
Differential Revision: D16377132
Pulled By: myleott
fbshipit-source-id: f1cb88e671ccd510e53334d0f449fe18585268c7
Summary:
Fairseq wouldn't install on macOS.
A workaround was found here: https://github.com/pytorch/fairseq/issues/289
This is now automatic in setup.py, though maybe there's a cleaner way to do it.
I checked that it compiles fine on Linux and macOS.
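The platform check in `setup.py` looks roughly like this (a sketch; the flags come from the workaround in the linked issue and may differ from the merged change):

```python
# setup.py (sketch)
import sys

if sys.platform == "darwin":
    # Clang on macOS needs to be pointed at libc++ explicitly,
    # per the workaround discussed in pytorch/fairseq#289.
    extra_compile_args = ["-stdlib=libc++", "-O3"]
else:
    extra_compile_args = ["-std=c++11", "-O3"]
```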
Pull Request resolved: https://github.com/pytorch/fairseq/pull/862
Differential Revision: D16142105
Pulled By: myleott
fbshipit-source-id: 998ac7781d7a1ac047f4f9239c1fe16eab4be0dd
Summary:
Notable (possibly breaking) changes:
- d45db80: Move checkpoint utility functions from utils.py into checkpoint_utils.py
- f2563c2: Move LM definitions into separate files
- dffb167: Updates to model API:
  - `FairseqModel` -> `FairseqEncoderDecoderModel`
  - add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer` (see the sketch after these notes)
  - `encoder_out_dict` -> `encoder_out`
  - remove unused `remove_head` functions
- 34726d5: Move `distributed_init` into `DistributedFairseqModel`
- cf17068: Simplify distributed launch by automatically launching multiprocessing on each node for all visible GPUs (allows launching just one job per node instead of one per GPU)
- d45db80: Change default LR scheduler from `reduce_lr_on_plateau` to `fixed`
- 96ac28d: Rename `--sampling-temperature` -> `--temperature`
- fc1a19a: Deprecate dummy batches
- a1c997b: Add memory mapped datasets
- 0add50c: Allow cycling over multiple datasets, where each one becomes an "epoch"
Plus many additional features and bugfixes
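A minimal sketch of what the `extract_features`/`output_layer` split enables (signatures simplified; real fairseq decoders also return extra state alongside the features):

```python
from fairseq.models import FairseqDecoder

class MyDecoder(FairseqDecoder):
    """Sketch only: forward = feature extraction + output projection."""

    def forward(self, prev_output_tokens, encoder_out=None):
        # Callers that only need hidden states (e.g. for probing or
        # adapters) can call extract_features() directly and skip the
        # vocabulary projection in output_layer().
        features = self.extract_features(prev_output_tokens, encoder_out)
        return self.output_layer(features)
```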
Pull Request resolved: https://github.com/pytorch/fairseq/pull/817
Differential Revision: D15913844
Pulled By: myleott
fbshipit-source-id: d5b5d678efdd9dd3e4d7ca848ddcf1ec2b21bf6b
Summary:
See #467. Ping myleott to review.
This is a work-related contribution. Ping lark to review.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/794
Differential Revision: D15756816
Pulled By: myleott
fbshipit-source-id: 6dce3ff3a713bf5f60e5782bc260b2ca9d2c0a9b
- no more FP16Trainer, we just have an FP16Optimizer wrapper
- most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time
- Trainer now requires an extra dummy_batch argument at initialization, which we do fwd/bwd on when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0 (see the sketch after this list)
- Trainer.train_step now takes a list of samples, which will allow a cleaner `--update-freq` implementation
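A sketch of the dummy-batch trick (simplified, not the Trainer's exact code): every worker still runs a synchronized fwd/bwd, but the zero-scaled loss contributes nothing to the all-reduced gradients:

```python
def train_step(model, sample, is_dummy_batch):
    loss = model(**sample)  # assume the model returns a scalar loss
    if is_dummy_batch:
        # Keep the backward pass (and its gradient all-reduce) in
        # lockstep with other workers while contributing zero gradient.
        loss = loss * 0
    loss.backward()
```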
This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.
Changes:
- c7033ef: add support for distributed training! See updated README for usage.
- e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc. (see the sketch after this list)
- 154e440: update LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf
- 90c2973 and 1da6265: improve unit test coverage
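For example, the registries let downstream code plug in a model by name; a minimal sketch of the pattern (the encoder/decoder builders are hypothetical placeholders, and `FairseqModel` was later renamed `FairseqEncoderDecoderModel`):

```python
from fairseq.models import FairseqModel, register_model

@register_model("my_toy_model")
class MyToyModel(FairseqModel):
    @staticmethod
    def add_args(parser):
        # model-specific CLI flags are declared here
        parser.add_argument("--toy-embed-dim", type=int, default=64)

    @classmethod
    def build_model(cls, args, task):
        encoder = build_encoder(args, task.source_dictionary)  # hypothetical helper
        decoder = build_decoder(args, task.target_dictionary)  # hypothetical helper
        return cls(encoder, decoder)
```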
Release notes:
- 5c7f495: Added simple LSTM model with input feeding and attention
- 6e4b7e2: Refactored model definitions and incremental generation to be cleaner
- 7ae79c1: Split interactive generation out of generate.py and into a new binary: interactive.py
- 19a3865: Subtle correctness fix in beam search decoder. Previously, for a beam size of k, we might emit a hypothesis if the <eos> was among the top 2*k candidates. Now we only emit hypotheses for which the <eos> is among the top-k candidates (see the sketch at the end of these notes). This may subtly change generation results, and in the case of k=1 we will now produce strictly greedy outputs.
- 97d7fcb: Fixed bug in padding direction, where previously we right-padded the source and left-padded the target. We now left-pad the source and right-pad the target. This should not affect existing trained models, but may change (usually improves) the quality of new models.
- f442f89: Add support for batching based on the number of sentences (`--max-sentences`) in addition to the number of tokens (`--max-tokens`). When batching by the number of sentences, one can optionally normalize the gradients by the number of sentences with `--sentence-avg` (the default is to normalize by the number of tokens).
- c6d6256: Add `--log-format` option and JSON logger
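A sketch of the corrected `<eos>` condition from 19a3865 (function name and shapes are illustrative, not the decoder's actual code):

```python
import torch

def may_finalize(lprobs, eos, beam_size):
    """lprobs: (num_beams, vocab) log-probs for the next token.

    A hypothesis is finalized only when <eos> is among the top
    beam_size candidates (previously top 2*beam_size), so beam_size=1
    degenerates to strictly greedy search.
    """
    topk = lprobs.topk(beam_size, dim=-1).indices  # (num_beams, beam_size)
    return (topk == eos).any(dim=-1)

# toy check: vocab of 4 where eos (id 3) is the argmax, beam size 1
lp = torch.log_softmax(torch.tensor([[0.1, 0.2, 0.3, 2.0]]), dim=-1)
print(may_finalize(lp, eos=3, beam_size=1))  # tensor([True])
```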