From d5f76d7446537fc7d6252c4b1ea5836e02647948 Mon Sep 17 00:00:00 2001
From: Khoa Ho <25312735+khoa-ho@users.noreply.github.com>
Date: Thu, 30 May 2019 12:02:58 -0700
Subject: [PATCH] Clarify mixed precision training support (#766)

Summary:
Change the wording to avoid confusion. Mixed precision ensures both higher arithmetic throughput and numerical stability; it is not exactly synonymous with pure half-precision/FP16 training. Also mention tensor cores, since older-generation GPUs without tensor cores don't support true mixed precision training.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/766

Differential Revision: D15559565

Pulled By: myleott

fbshipit-source-id: c71e720772657bb3e8ad330b58bf69e23beb614e
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c18dfb280..4f4dfdb46 100644
--- a/README.md
+++ b/README.md
@@ -28,7 +28,7 @@ Fairseq features:
 - Diverse Beam Search ([Vijayakumar et al., 2016](https://arxiv.org/abs/1610.02424))
 - sampling (unconstrained and top-k)
 - large mini-batch training even on a single GPU via delayed updates
-- fast half-precision floating point (FP16) training
+- mixed precision training (trains faster with less GPU memory on [NVIDIA tensor cores](https://developer.nvidia.com/tensor-cores))
 - extensible: easily register new models, criterions, tasks, optimizers and learning rate schedulers

 We also provide [pre-trained models](#pre-trained-models-and-examples) for several benchmark
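
The commit message distinguishes mixed precision training from pure FP16 training: compute runs in FP16 for throughput, while FP32 master weights and loss scaling preserve numerical stability. Below is a minimal PyTorch sketch of that general recipe; it is illustrative only, not fairseq's actual FP16Optimizer, and the layer sizes, batch shape, and static loss scale of 128 are made-up values (real implementations typically scale dynamically and require a CUDA GPU).

```python
import torch

# FP16 copy of the model used for the forward/backward compute.
model = torch.nn.Linear(1024, 1024).cuda().half()

# FP32 "master" copies of the parameters; the optimizer updates these.
master_params = [p.detach().clone().float() for p in model.parameters()]
for p in master_params:
    p.requires_grad_(True)
optimizer = torch.optim.SGD(master_params, lr=0.1)

loss_scale = 128.0  # hypothetical static scale; real code adjusts it dynamically
x = torch.randn(32, 1024, device="cuda", dtype=torch.float16)
target = torch.randn(32, 1024, device="cuda", dtype=torch.float16)

# FP16 forward pass; scale the loss before backward so small gradients
# don't underflow in half precision.
loss = torch.nn.functional.mse_loss(model(x), target)
(loss * loss_scale).backward()

# Copy gradients onto the FP32 master weights, unscale, and step in full precision.
for p_half, p_master in zip(model.parameters(), master_params):
    p_master.grad = p_half.grad.float() / loss_scale
optimizer.step()
optimizer.zero_grad()
model.zero_grad()

# Copy the updated FP32 master weights back into the FP16 model for the next step.
with torch.no_grad():
    for p_half, p_master in zip(model.parameters(), master_params):
        p_half.copy_(p_master.half())
```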