Summary:
this adds a hydra_train binary that uses Hydra configs and command-line overrides instead of argparse
use case 1: built-in configs + overrides from the command line
```
python fairseq_cli/hydra_train.py distributed_training.distributed_world_size=1 dataset.batch_size=2 task.data=/private/home/myleott/data/data-bin/wikitext-103-roberta-bpe-bin/ model=transformer_lm/transformer_lm_gpt task=language_modeling optimization.max_update=5000
```
use case 2: use an external config file instead of the bundled configs (dataclass defaults still apply)
```
python fairseq_cli/hydra_train.py --config-path ~/fairseq-py-dev/lm --config-name wiki103
```
the config file contains this:
```
# @package _group_
model:
  _name: transformer_lm
distributed_training:
  distributed_world_size: 1
dataset:
  batch_size: 2
task:
  _name: language_modeling
  data: /private/home/myleott/data/data-bin/wikitext-103-roberta-bpe-bin/
  add_bos_token: false
  max_target_positions: 1024
optimization:
  max_update: 50000
  lr: [ 0.25 ]
criterion: cross_entropy
optimizer: adam
lr_scheduler:
  _name: cosine
```
use case 3: use an external config directory that provides additional configs (e.g. for models)
```
python fairseq_cli/hydra_train.py distributed_training.distributed_world_size=1 dataset.batch_size=2 task.data=/private/home/myleott/data/data-bin/wikitext-103-roberta-bpe-bin/ model=transformer_lm/2_layers task=language_modeling optimization.max_update=5000 --config-dir ~/fairseq-py-dev/lm/hydra
```
where ~/fairseq-py-dev/lm/hydra has the following structure:
- model
  - transformer_lm
    - 2_layers.yaml
and inside 2_layers.yaml is a copy of transformer_lm_gpt.yaml but with decoder_layers set to 2
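For illustration, such a 2_layers.yaml might look like the sketch below; the field names and values are assumptions modeled on the GPT-sized transformer_lm defaults, with only decoder_layers changed:
```
# @package _group_
# hypothetical 2_layers.yaml: same fields as transformer_lm_gpt.yaml
# (values below are assumptions), with decoder_layers reduced to 2
_name: transformer_lm
activation_fn: gelu
dropout: 0.1
attention_dropout: 0.1
decoder_embed_dim: 768
decoder_ffn_embed_dim: 3072
decoder_layers: 2
decoder_attention_heads: 12
```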
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1393
Reviewed By: myleott
Differential Revision: D24722252
Pulled By: alexeib
fbshipit-source-id: 758ea431fa099cd7c0e4daf41eff680df1d3b841
Summary:
This diff is based on feedback in D24379649
Before, when loading checkpoints:
Each rank loaded the checkpoint from Manifold.
Now:
Rank 0 loads the checkpoint from Manifold and broadcasts it to all other ranks. This saves IO.
Furthermore, when doing zero-sharding, we only broadcast the relevant parts of the optimizer state to each node. This makes checkpoint loading more memory-efficient and should enable loading models beyond 2-3B parameters.
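Not the fairseq implementation, but a minimal sketch of the rank-0-load-then-broadcast pattern described above, using plain torch.distributed (function name and path are placeholders):
```python
import torch
import torch.distributed as dist


def load_checkpoint_broadcast(path: str):
    """Rank 0 reads the checkpoint; every other rank receives it over the wire."""
    if dist.get_rank() == 0:
        state = torch.load(path, map_location="cpu")
    else:
        state = None
    # broadcast_object_list pickles the object on the source rank and
    # unpickles it everywhere else, so only rank 0 touches storage.
    holder = [state]
    dist.broadcast_object_list(holder, src=0)
    return holder[0]
```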
Reviewed By: myleott
Differential Revision: D24660791
fbshipit-source-id: e30b2ea5990083375e4549f0427a112346ba170d
Summary:
- Set default value of clip-norm back to 0.0 (disabled)
- Add comment explaining that we divide the loss by log(2) to convert the base (see the short conversion snippet after this list)
- Fix `--zero-optimizer=os` (fixes #2811)
- Update requirements to PyTorch >= 1.5
- Fix bug in fixed LR schedule
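For reference, the base conversion mentioned in the second bullet is just the following (illustrative snippet, not fairseq code): cross-entropy computed with natural logs is in nats, and dividing by ln 2 gives bits.
```python
import math

loss_nats = 4.85                      # example cross-entropy value in nats (natural log)
loss_bits = loss_nats / math.log(2)   # convert base: log2(x) = ln(x) / ln(2)
print(f"{loss_bits:.3f} bits per token")
```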
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1392
Reviewed By: alexeib
Differential Revision: D24714231
Pulled By: myleott
fbshipit-source-id: 63dc8cfc74683bbccbf05b44228014eb12ddbfc7
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2833
Add support for filling masks using BART on a batch of sentences. This is helpful when running on GPU.
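A usage sketch, assuming the batched API simply accepts a list of masked sentences in place of a single string (the model path and topk value below are made up):
```python
import torch
from fairseq.models.bart import BARTModel

# Load a pretrained BART model (path/checkpoint names are placeholders).
bart = BARTModel.from_pretrained("/path/to/bart.large", checkpoint_file="model.pt")
bart.eval()
if torch.cuda.is_available():
    bart.cuda()  # batching is mainly useful when running on GPU

masked_inputs = [
    "The cat <mask> on the mat.",
    "PyTorch is a <mask> learning framework.",
]
# One call fills the masks for every sentence in the batch.
print(bart.fill_mask(masked_inputs, topk=3))
```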
Reviewed By: myleott
Differential Revision: D24687773
fbshipit-source-id: 1b8005c18a09be526f40e9e2b99207afa38e0f1a
Summary: In the past, we always used a shared dictionary for multilingual experiments. This diff re-enables different dictionaries for source and target languages by changing the assertion criteria and reverting to using the specific source and target languages when returning source_dict and target_dict.
Reviewed By: chtran
Differential Revision: D24637682
fbshipit-source-id: a982e4f1e48395cc5bf10dc03b98fbe970062f8d
Summary:
This PR reverts recent changes that attempted to make `--user-dir` work with non-unique module names. But that new approach introduced other issues (e.g., poor compatibility with multiprocessing and Windows), so let's revert to the previous simpler implementation.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2815
Reviewed By: alexeib
Differential Revision: D24611571
Pulled By: myleott
fbshipit-source-id: cecfe28395585ca0401f844f10bd0d49d014c4d8
Summary:
…d of cfg.task
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2813
Reviewed By: alexeib
Differential Revision: D24604698
Pulled By: myleott
fbshipit-source-id: e41996147203ec47274ded803bab910460a19eb3
Summary:
Fixes https://github.com/pytorch/fairseq/issues/1205
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2801
Reviewed By: alexeib
Differential Revision: D24579193
Pulled By: myleott
fbshipit-source-id: bcb14bb588d4538398bff4114e0a387fd29818c5
Summary:
Configs can either be in `/fairseq/configs` (once the package is installed) or `/configs` (if using an editable installation). This centralizes the hydra init and supports these two possible config locations.
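A minimal sketch of the lookup this implies (the helper name is hypothetical, not fairseq's API): try the installed package location first, then fall back to the repository root used by editable installs.
```python
import os

import fairseq


def find_config_dir() -> str:
    """Return the first existing configs directory (hypothetical helper)."""
    pkg_dir = os.path.dirname(fairseq.__file__)
    candidates = [
        os.path.join(pkg_dir, "configs"),                   # installed package: fairseq/configs
        os.path.join(os.path.dirname(pkg_dir), "configs"),  # editable install: <repo>/configs
    ]
    for path in candidates:
        if os.path.isdir(path):
            return path
    raise FileNotFoundError("could not locate a fairseq configs directory")
```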
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2784
Reviewed By: alexeib
Differential Revision: D24513586
Pulled By: myleott
fbshipit-source-id: 8e10a88177ebcf809d5d37d448d2b384142febef
Summary:
# What does this PR do?
Addresses https://github.com/pytorch/fairseq/issues/2772 where external users can't generate using the model because the README is currently not accurate.
This PR fixes those issues in the README.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1366
Reviewed By: edunov
Differential Revision: D24455634
Pulled By: shruti-bh
fbshipit-source-id: 480a11f8b95d1278162d585700e58d467a35d35a
Summary:
Fixed link.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2796
Reviewed By: nlaptev
Differential Revision: D24538759
Pulled By: myleott
fbshipit-source-id: af947f432c34ca2aec35c9fe59dd1214e363450b
Summary: Reverting the diff because it has already been fixed in https://github.com/pytorch/pytorch/pull/45413
Reviewed By: myleott
Differential Revision: D24511658
fbshipit-source-id: a5561dae50d69a03443ca8a60bebe2cd064e3ee0
Summary:
- refactor the dataclass hierarchy to make it a bit more sane (while avoiding circular references)
- add a top-level FairseqConfig (a schematic sketch follows below)
- change type hints to reflect the correct config type where it is known
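A schematic of the shape this gives the config (field names and defaults here are illustrative, not the real fairseq dataclasses):
```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetConfig:
    batch_size: int = 1
    num_workers: int = 1


@dataclass
class OptimizationConfig:
    max_update: int = 0
    lr: List[float] = field(default_factory=lambda: [0.25])


@dataclass
class FairseqConfig:
    # Top-level config composing the per-group sub-configs, so functions can
    # take e.g. cfg.optimization with a precise type hint.
    dataset: DatasetConfig = field(default_factory=DatasetConfig)
    optimization: OptimizationConfig = field(default_factory=OptimizationConfig)
```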
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1371
Reviewed By: myleott
Differential Revision: D24469026
Pulled By: alexeib
fbshipit-source-id: 01f68918f761d51ec5216286b8959ad35f41a7b2
Summary:
some binaries (e.g. speech-based ones) used --post-process while others used --remove-bpe. --post-process seems more appropriate, as the option now does more than just remove BPE. this renames remove_bpe to post_process, adds an alias so existing command lines keep working, and adds checkpoint upgrade logic so old checkpoints continue to work as well.
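A rough sketch of the alias pattern (only the flag names come from the commit; the exact argparse wiring below is an assumption):
```python
import argparse

parser = argparse.ArgumentParser()
# Register the new canonical flag plus the legacy spelling as an alias, so
# existing command lines that pass --remove-bpe keep working unchanged.
parser.add_argument(
    "--post-process", "--remove-bpe",
    dest="post_process",
    nargs="?",
    const="subword_nmt",
    default=None,
    help="post-process text by removing BPE, letter segmentation, etc.",
)

args = parser.parse_args(["--remove-bpe"])
print(args.post_process)  # -> "subword_nmt"
```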
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1369
Reviewed By: myleott
Differential Revision: D24465040
Pulled By: alexeib
fbshipit-source-id: 1b3e388291ccc403e76e069ef6606b80ead863a7
Summary:
This will produce version strings like `1.0.0a0+3065963`, similar to PyTorch version strings.
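A minimal sketch of how a setup script can build such a version string (not necessarily what setup.py does here): append the short git SHA to the base version.
```python
import subprocess

version = "1.0.0a0"
try:
    # e.g. "1.0.0a0+3065963", mirroring PyTorch's local version identifiers
    sha = (
        subprocess.check_output(["git", "rev-parse", "--short", "HEAD"])
        .decode("ascii")
        .strip()
    )
    version += "+" + sha
except Exception:
    pass  # not a git checkout, e.g. building from an sdist

print(version)
```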
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2774
Reviewed By: alexeib
Differential Revision: D24453517
Pulled By: myleott
fbshipit-source-id: 03a0c324ed6124bbc513ba7edc954abd71d63a0f
Summary:
support new cfg-based models; make sure --normalize in infer is consistent with the model
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1370
Reviewed By: myleott
Differential Revision: D24467698
Pulled By: alexeib
fbshipit-source-id: 056b3608e3c1fe8acdb3e45e0306de5d874cb4d1
Summary:
Fixes https://github.com/pytorch/fairseq/issues/2724
A small PR touching the caching/downloading logic used in fairseq (which is very similar to that used in pytorch/transformers).
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2767
Reviewed By: edunov
Differential Revision: D24456055
Pulled By: myleott
fbshipit-source-id: bc634a9b97f957ecc5a8da57b112ff892e492107
Summary:
Args should be registered in the Model rather than in its modules
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1365
Reviewed By: pipibjc
Differential Revision: D24453007
Pulled By: myleott
fbshipit-source-id: d22b0d86a3c940456b394b005acab4bb6a3f5bed
Summary:
Typically `torch.hub.load(...)` doesn't call `pip install`, so our Cython components never get built. We have a hack in our hubconf that builds these components by running the equivalent of `python setup.py build_ext --inplace` using the setuptools sandbox: f6677b6755/hubconf.py (L52-L55).
Unfortunately, this sandbox gets mad if you modify the filesystem, which is what this recent change does: f6677b6755/setup.py (L203-L205). Combined, this breaks torch.hub.
The solution: when we're doing `build_ext`, don't set up the symlinks. This is fine, since `build_ext` doesn't actually build a package, so we don't care about including the config or examples directories.
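A sketch of the guard described above (the helper and data paths are assumptions, not the actual setup.py): skip creating the symlinks whenever the requested command is build_ext.
```python
import os
import sys
from setuptools import setup


def make_symlinks():
    """Symlink config/ and examples/ into the package dir (sketch only)."""
    for name in ("config", "examples"):
        dst = os.path.join("fairseq", name)
        if not os.path.exists(dst):
            os.symlink(os.path.join("..", name), dst)


if "build_ext" in sys.argv[1:]:
    # torch.hub's in-place build runs inside the setuptools sandbox, which
    # rejects filesystem changes; build_ext doesn't package data anyway.
    package_data = {}
else:
    make_symlinks()
    package_data = {"fairseq": ["config/**/*.yaml", "examples/**"]}

setup(
    name="fairseq",
    version="1.0.0a0",
    packages=["fairseq"],
    package_data=package_data,
)
```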
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2762
Reviewed By: alexeib
Differential Revision: D24430228
Pulled By: myleott
fbshipit-source-id: e05d075a003ddfde196cb8a86b32882d73808015
Summary:
It's sufficient to set logging.basicConfig in the outermost calling code, like train.py or generate.py. The logging.basicConfig call there (like [here](https://github.com/pytorch/fairseq/blob/master/fairseq_cli/generate.py#L54)) has no effect if logging.basicConfig has already been called somewhere deeper in the code, since only the first call configures the root logger.
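An illustrative sketch of the convention this enforces (not fairseq code): configure logging once at the entry point, and have library modules only create named loggers.
```python
import logging
import sys

# Entry point (e.g. train.py): the one and only basicConfig call.
logging.basicConfig(
    format="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
    level=logging.INFO,
    stream=sys.stdout,
)

# Library module: never calls basicConfig, just uses a named logger.
logger = logging.getLogger(__name__)
logger.info("training started")
```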
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2733
Reviewed By: alexeib
Differential Revision: D24418987
Pulled By: myleott
fbshipit-source-id: 862d200023357de8947799f380e513f4c411b143
Summary:
Upgrade args: rename `max_sentences` to `batch_size`.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/2754
Reviewed By: alexeib
Differential Revision: D24418980
Pulled By: myleott
fbshipit-source-id: 5269c2fc8c434513cc5114f7e9d2eccd0c553fbd
Summary:
Fixes issues #2761 and #2760: args from registries were not added to argparse.
Reviewed By: myleott
Differential Revision: D24422792
fbshipit-source-id: c8a8e835965da5c4f527bd589bd621371441e7fe
Summary:
Pull Request resolved: https://github.com/facebookresearch/pytext/pull/1510
this is the main PR that switches on Hydra functionality in fairseq.
we migrate the "args" object into an OmegaConf "DictConfig" at all legacy entry points (a minimal sketch follows below).
in addition, this migrates various components from secondary registries (like BPE encoders and tokenizers) to make the migration smoother.
i am going through code that references migrated fairseq components and changing it to inherit from "Legacy*" components instead; hopefully tests will catch most of this.
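The basic idea of the legacy entry-point migration, as a minimal sketch using OmegaConf directly (the real migration also organizes flags into sub-configs such as task, model, and optimization, as in the config examples above):
```python
import argparse

from omegaconf import DictConfig, OmegaConf

parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.25)
parser.add_argument("--max-update", type=int, default=5000)
args = parser.parse_args([])

# Turn the legacy argparse Namespace into an OmegaConf DictConfig so the same
# downstream code can consume either representation.
cfg: DictConfig = OmegaConf.create(vars(args))
print(cfg.lr, cfg.max_update)
```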
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1343
Reviewed By: myleott
Differential Revision: D23973928
Pulled By: alexeib
fbshipit-source-id: dd9554981fff51ea75c1ff343874d1d6e61793c9
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1360
Reviewed By: myleott
Differential Revision: D24393217
Pulled By: huihuifan
fbshipit-source-id: a110ef6958b1e15cd8c4e23b610db5cfc994f06d