10 Commits

Author SHA1 Message Date
alexeib
e3c4282551 remove max_sentences from args, use batch_size instead (#1333)
Summary:
Now that we are moving to dataclasses to define fairseq configuration, having aliases for options is no longer practical. This PR removes the "max-sentences" argument while keeping its alias "batch-size", which is the more appropriate name.
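
As a rough illustration of why aliases fit poorly with dataclass-defined options, the sketch below contrasts an argparse option that carries two names with a parser generated from dataclass fields, where each field yields exactly one option. `ToyDatasetConfig`, `parse_with_alias`, and `parse_from_dataclass` are hypothetical names for this example only and are not fairseq's actual configuration code.

```python
import argparse
from dataclasses import dataclass, fields


def parse_with_alias(argv):
    # Old style: one argument, two option strings (an alias).
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--max-sentences", "--batch-size", dest="max_sentences", type=int
    )
    return parser.parse_args(argv)


@dataclass
class ToyDatasetConfig:
    # Hypothetical config: each field has exactly one name.
    batch_size: int = 8


def parse_from_dataclass(argv):
    # New style (assumed): derive one option per dataclass field.
    parser = argparse.ArgumentParser()
    for f in fields(ToyDatasetConfig):
        flag = "--" + f.name.replace("_", "-")
        parser.add_argument(flag, type=f.type, default=f.default)
    return ToyDatasetConfig(**vars(parser.parse_args(argv)))
```

With a parser derived this way there is no natural place to declare a second spelling for a field, so keeping only "batch-size" avoids special-casing.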

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1333

Reviewed By: shruti-bh

Differential Revision: D24121305

Pulled By: alexeib

fbshipit-source-id: 34343cea54c8f2c8b059c38ef9f29b66e76df9fb
2020-10-05 19:09:01 -07:00
Myle Ott
1cc8e95cec Don't cache epoch iterators when using sharded datasets (#1268)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1268

We previously had a memory leak when using sharded datasets. In particular,
each sharded dataset is a new FairseqDataset instance, and the cache is keyed
by the `dataset` instance. Since we never clear the cache, this would
eventually cause the system to run out of CPU RAM.

This diff disables caching when using sharded datasets.
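
The leak described above can be reproduced in miniature. The following is a toy sketch, not fairseq's implementation: `ToyDataset`, `ToyTask`, the `dataset_to_iter` cache, and the `disable_iterator_cache` flag are names assumed for illustration.

```python
class ToyDataset:
    """Stand-in for one shard; every shard is a distinct object."""

    def __init__(self, shard_id):
        self.shard_id = shard_id
        self.items = list(range(4))


class ToyTask:
    def __init__(self):
        # Cache keyed by the dataset instance and never cleared.
        self.dataset_to_iter = {}

    def get_batch_iterator(self, dataset, disable_iterator_cache=False):
        can_reuse = not disable_iterator_cache
        if can_reuse and dataset in self.dataset_to_iter:
            return self.dataset_to_iter[dataset]
        # Group items into batches of two.
        batches = [
            dataset.items[i:i + 2] for i in range(0, len(dataset.items), 2)
        ]
        if can_reuse:
            self.dataset_to_iter[dataset] = batches
        return batches


def cache_size_after(num_shards, disable_iterator_cache):
    # Load a fresh dataset object per shard, as a sharded setup would.
    task = ToyTask()
    for shard_id in range(num_shards):
        task.get_batch_iterator(ToyDataset(shard_id), disable_iterator_cache)
    return len(task.dataset_to_iter)
```

Because each shard is a new key, the cache grows by one entry per shard and keeps every dataset alive; with caching disabled it stays empty.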

Note that we also change the signature of `get_batch_iterator`, which needs to
propagate to many places. We previously avoided this update when adding
`data_buffer_size`, so I'm also adding that everywhere.

Reviewed By: ngoyal2707

Differential Revision: D23319135

fbshipit-source-id: 6bcd6aee141ad9cc234448c49106a8dbf8ea1800
2020-09-09 06:20:31 -07:00
Myle Ott
fe1b1bbe17 Misc fixes (#2524)
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/2524

Reviewed By: ngoyal2707

Differential Revision: D23318746

Pulled By: myleott

fbshipit-source-id: 6db6a87aac178847bd0da26db09b1a63632a724f
2020-08-31 11:29:21 -07:00
Myle Ott
2f7e3f3323 Support multi-GPU validation in fairseq-validate (#2162)
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/2162

Reviewed By: ngoyal2707

Differential Revision: D21663181

Pulled By: myleott

fbshipit-source-id: d01e64f97482f76bd601cd8b20232c0ef637bb8a
2020-05-27 10:24:54 -07:00
Naman Goyal
d37fdee3da adding code to load and save model parallel checkpoint (#1119)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1119

Reviewed By: myleott

Differential Revision: D20712488

fbshipit-source-id: 941ef251c9e2deb8933d88188fac56ee8c5be9b7
2020-03-27 20:22:36 -07:00
Naman Goyal
f3680fd804 adding eval lm changes for model parallel (#1113)
Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1113

Reviewed By: myleott

Differential Revision: D20670665

fbshipit-source-id: 8e2846637195b7200f1f60a8421d2fe5ffab789b
2020-03-27 15:22:57 -07:00
Myle Ott
aa79bb9c37 Use 1-based indexing for epochs everywhere (#1053)
Summary:
We are somewhat inconsistent in whether we're using 0-based or 1-based indexing for epochs. This should fix things to be 0-based internally, with logging and checkpoint naming still using 1-based indexing.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1053

Reviewed By: spencerp

Differential Revision: D20160715

Pulled By: myleott

fbshipit-source-id: 4ed94f9c371e1bfe29bcfa087fa6756507d6e627
2020-03-04 16:37:24 -08:00
Myle Ott
077c351d7e Fix comments and logger name in validate.py (#1061)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1061

Differential Revision: D20238796

Pulled By: myleott

fbshipit-source-id: cf48bd7f6cdae05e91868a9d2efd91dc8e72bb12
2020-03-04 14:05:24 -08:00
Myle Ott
f8b795f427 Move meters, metrics and progress_bar into fairseq.logging (#1046)
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/1046

Differential Revision: D20030412

Pulled By: myleott

fbshipit-source-id: bd87391aa9cdb73306ee90a30eeb2bdeff3690f9
2020-02-27 08:24:59 -08:00
Myle Ott
b488e1fe56 Reverse symlinks in root and fairseq_cli (2/3)
Summary: This is needed to support other build environments (e.g., Windows)

Reviewed By: ngoyal2707

Differential Revision: D19409984

fbshipit-source-id: e970510781abf92f1b02d0961bc30e1210b524dd
2020-01-17 08:26:20 -08:00