fairseq

mirror of https://github.com/facebookresearch/fairseq.git synced 2024-08-16 20:10:40 +03:00

Author	SHA1	Message	Date
Guillaume Wenzek	699ab19014	run all tests (#4733 ) * run all tests * make torch a build-time dependency * add 'dev' extra deps to install black, flake, pytest at once * Build docs in CI This should also help catch some import bugs, since sphinx inspect a lot of code * CI should do the real install not "--editable" * check installation succeeded * add missing __init__.py file * add check installation * move check_installation.py to its own script * fix pytest import mode, force recent numpy, torch * run black before flake and tests * torch >= 1.10.0 * use torch 1.10 for GPU tests	2022-09-23 18:40:50 +02:00
Diana Liskovich	5adfeaccf9	Rename references from master -> main in preparation for branch name change (#2297 ) Summary: # Before submitting - [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements) - [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)? - [ ] Did you make sure to update the docs? - [ ] Did you write any new necessary tests? ## What does this PR do? Fixes # (issue). ## PR review Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged. ## Did you have fun? Make sure you had fun coding � Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/2297 Reviewed By: alexeib Differential Revision: D30906090 Pulled By: dianaml0 fbshipit-source-id: 941d30db7f766c9077a1b5bb2a04680f57e2e070	2021-09-20 08:29:38 -07:00
Myle Ott	e0737c3c29	Dynamically generate versions based on commit hash (#2774 ) Summary: This will produce version strings like `1.0.0a0+3065963`, similar to PyTorch version strings. Pull Request resolved: https://github.com/pytorch/fairseq/pull/2774 Reviewed By: alexeib Differential Revision: D24453517 Pulled By: myleott fbshipit-source-id: 03a0c324ed6124bbc513ba7edc954abd71d63a0f	2020-10-22 12:51:04 -07:00
Myle Ott	a48f235636	Apply black+isort (#1357 ) Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1357 Reviewed By: alexeib Differential Revision: D24377772 fbshipit-source-id: 51581af041d42d62166b33a35a1a4228b1a76f0c	2020-10-18 18:14:51 -07:00
Myle Ott	df2f84ce61	v0.8.0 -> v0.9.0 (#1452 ) Summary: Possibly breaking changes: - Set global numpy seed (`4a7cd58`) - Split `in_proj_weight` into separate k, v, q projections in MultiheadAttention (`fdf4c3e`) - TransformerEncoder returns namedtuples instead of dict (`27568a7`) New features: - Add `--fast-stat-sync` option (`e1ba32a`) - Add `--empty-cache-freq` option (`315c463`) - Support criterions with parameters (`ba5f829`) New papers: - Simple and Effective Noisy Channel Modeling for Neural Machine Translation (`49177c9`) - Levenshtein Transformer (`86857a5`, ...) - Cross+Self-Attention for Transformer Models (`4ac2c5f`) - Jointly Learning to Align and Translate with Transformer Models (`1c66792`) - Reducing Transformer Depth on Demand with Structured Dropout (`dabbef4`) - Unsupervised Cross-lingual Representation Learning at Scale (XLM-RoBERTa) (`e23e5ea`) - BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (`a92bcda`) - CamemBERT: a French BERT (`b31849a`) Speed improvements: - Add CUDA kernels for LightConv and DynamicConv (`f840564`) - Cythonization of various dataloading components (`4fc3953`, ...) - Don't project mask tokens for MLM training (`718677e`) Pull Request resolved: https://github.com/pytorch/fairseq/pull/1452 Differential Revision: D18798409 Pulled By: myleott fbshipit-source-id: 860a0d5aaf7377c8c9bd63cdb3b33d464f0e1727	2019-12-03 15:19:33 -08:00
Myle Ott	a0f75996b1	Fix building of docs Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1340 Differential Revision: D18289455 Pulled By: myleott fbshipit-source-id: a1c8163a35273b6c646d300142701e8a317d7378	2019-11-02 16:52:50 -07:00
Myle Ott	ffffe04ea1	v0.7.2 -> v0.8.0 (#1017 ) Summary: Changelog: - Relicensed under MIT license - Add RoBERTa - Add wav2vec - Add WMT'19 models - Add initial ASR code - Changed torch.hub interface (`generate` renamed to `translate`) - Add `--tokenizer` and `--bpe` - `f812e52`: Renamed data.transforms -> data.encoders - `654affc`: New Dataset API (optional) - `47fd985`: Deprecate old Masked LM components - `5f78106`: Set mmap as default dataset format and infer format automatically - Misc fixes for sampling - Misc fixes to support PyTorch 1.2 Pull Request resolved: https://github.com/pytorch/fairseq/pull/1017 Differential Revision: D16799880 Pulled By: myleott fbshipit-source-id: 45ad8bc531724a53063cbc24ca1c93f715cdc5a7	2019-08-14 05:02:45 -07:00
Myle Ott	b002d0096e	v0.7.1 -> v0.7.2 (#891 ) Summary: No major API changes since the last release. Cutting a new release since we'll be merging significant (possibly breaking) changes to logging, data loading and the masked LM implementation soon. Pull Request resolved: https://github.com/pytorch/fairseq/pull/891 Differential Revision: D16377132 Pulled By: myleott fbshipit-source-id: f1cb88e671ccd510e53334d0f449fe18585268c7	2019-07-19 06:33:40 -07:00
Myle Ott	881381cfc7	v0.7.1: fix PyPI setup and tests Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/818 Differential Revision: D15916265 Pulled By: myleott fbshipit-source-id: c66c0bd988d3472c4150226952f34ee8d4c3db86	2019-06-20 06:28:37 -07:00
Myle Ott	bd710e75ae	v0.7.0 (#817 ) Summary: Notable (possibly breaking) changes: - `d45db80`: Remove checkpoint utility functions from utils.py into checkpoint_utils.py - `f2563c2`: Move LM definitions into separate files - `dffb167`: Updates to model API: - `FairseqModel` -> `FairseqEncoderDecoderModel` - add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer` - `encoder_out_dict` -> `encoder_out` - rm unused `remove_head` functions - `34726d5`: Move `distributed_init` into `DistributedFairseqModel` - `cf17068`: Simplify distributed launch by automatically launching multiprocessing on each node for all visible GPUs (allows launching just one job per node instead of one per GPU) - `d45db80`: Change default LR scheduler from `reduce_lr_on_plateau` to `fixed` - `96ac28d`: Rename `--sampling-temperature` -> `--temperature` - `fc1a19a`: Deprecate dummy batches - `a1c997b`: Add memory mapped datasets - `0add50c`: Allow cycling over multiple datasets, where each one becomes an "epoch" Plus many additional features and bugfixes Pull Request resolved: https://github.com/pytorch/fairseq/pull/817 Differential Revision: D15913844 Pulled By: myleott fbshipit-source-id: d5b5d678efdd9dd3e4d7ca848ddcf1ec2b21bf6b	2019-06-19 19:08:50 -07:00
Myle Ott	e6422528da	0.6.1 -> 0.6.2 (#577 ) Summary: Changelog: - `998ba4f`: Add language models from Baevski & Auli (2018) - `4294c4f`: Add mixture of experts code from Shen et al. (2019) - `0049349`: Add example for multilingual training - `48d9afb`: Speed improvements, including fused operators from apex - `44d27e6`: Add Tensorboard support - `d17fa85`: Add Adadelta optimizer - `9e1c880`: Add `FairseqEncoderModel` - `b65c579`: Add `FairseqTask.inference_step` to modularize generate.py - `2ad1178`: Add back `--curriculum` - Misc bug fixes and other features Pull Request resolved: https://github.com/pytorch/fairseq/pull/577 Differential Revision: D14481233 Pulled By: myleott fbshipit-source-id: 4ff8625ef1c0b24273fc65df7c5658e3c932e8b7	2019-03-15 10:27:01 -07:00
Myle Ott	fbd4cef9a5	Add fairseq to PyPI (#495 ) Summary: - fairseq can now be installed via pip: `pip install fairseq` - command-line tools are globally accessible: `fairseq-preprocess`, `fairseq-train`, `fairseq-generate`, etc. Pull Request resolved: https://github.com/pytorch/fairseq/pull/495 Differential Revision: D14017761 Pulled By: myleott fbshipit-source-id: 10c9f6634a3056074eac2f33324b4f1f404d4235	2019-02-08 22:03:29 -08:00
Sergey Edunov	1082ba352c	Switch to DistributedDataParallelC10d and bump version 0.5.0 -> 0.6.0 - no more FP16Trainer, we just have an FP16Optimizer wrapper - most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time - Trainer now requires an extra dummy_batch argument at initialization, which we do fwd/bwd on when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0 - Trainer.train_step now takes a list of samples, which will allow cleaner --update-freq	2018-09-25 17:36:43 -04:00
Myle Ott	4a47b88992	Update documentation	2018-09-03 20:03:37 -04:00
Myle Ott	6381cc977f	Add documentation	2018-09-03 19:15:23 -04:00

15 Commits