# Linformer: Self-Attention with Linear Complexity (Wang et al., 2020)

This example contains code to train Linformer models as described in our paper
[Linformer: Self-Attention with Linear Complexity](https://arxiv.org/abs/2006.04768).
## Training a new Linformer RoBERTa model

You can mostly follow the RoBERTa pretraining README, updating your training command
with `--user-dir examples/linformer/linformer_src --arch linformer_roberta_base`.
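As a rough sketch, a pretraining command might look like the one below. The data path, batch size, and optimization hyperparameters are placeholders; for a faithful setup, take the full set of values from the RoBERTa pretraining README and only swap in the `--user-dir` and `--arch` flags above.

```bash
DATA_DIR=data-bin/wikitext-103   # placeholder: your preprocessed, binarized pretraining data

fairseq-train $DATA_DIR \
    --user-dir examples/linformer/linformer_src \
    --arch linformer_roberta_base \
    --task masked_lm --criterion masked_lm \
    --sample-break-mode complete --tokens-per-sample 512 \
    --optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-6 --clip-norm 0.0 \
    --lr-scheduler polynomial_decay --lr 0.0005 --warmup-updates 10000 --total-num-update 125000 \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
    --batch-size 16 --update-freq 16 \
    --max-update 125000 --log-format simple --log-interval 100
```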
## Citation

If you use our work, please cite:
```bibtex
@article{wang2020linformer,
  title={Linformer: Self-Attention with Linear Complexity},
  author={Wang, Sinong and Li, Belinda and Khabsa, Madian and Fang, Han and Ma, Hao},
  journal={arXiv preprint arXiv:2006.04768},
  year={2020}
}
```