TF32 - TensorFloat-32

NVIDIA's 19-bit hybrid: FP32 range meets FP16 precision for accelerated tensor math

Bit Layout

A TF32 number uses 19 bits divided into three fields: 1 sign bit, 8 exponent bits, and 10 mantissa bits.

TF32 combines the 8-bit exponent of FP32/BF16 with the 10-bit mantissa of FP16, giving it FP32's range with FP16's precision.
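As a rough illustration of this layout, the sketch below pulls the three TF32 fields out of a value's FP32 bit pattern. The helper name `tf32_fields` is hypothetical, and the code assumes the TF32 mantissa is simply the top 10 of FP32's 23 mantissa bits.

```python
import struct


def tf32_fields(x: float) -> tuple:
    """Return (sign, exponent, mantissa) of x viewed as a TF32 number.

    Hypothetical helper: x is packed as FP32 first, then the top 10 of
    its 23 mantissa bits are kept.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", x))  # raw 32-bit pattern
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    mantissa = (bits >> 13) & 0x3FF  # top 10 of the 23 FP32 mantissa bits
    return sign, exponent, mantissa


print(tf32_fields(1.0))   # (0, 127, 0)
print(tf32_fields(-1.5))  # (1, 127, 512)
```

The three fields add up to 1 + 8 + 10 = 19 bits, which is where the "19-bit" description comes from.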

Overview

TF32 (TensorFloat-32) was introduced by NVIDIA with the Ampere architecture (A100 GPU) in 2020. Despite the "32" in its name, TF32 is actually a 19-bit format. The name reflects that it is used as a drop-in replacement for FP32 in tensor operations.

Unlike FP16 or BF16, TF32 is not a storage format: data stays in FP32 memory layout, and TF32 is used only inside Tensor Cores during matrix multiply-accumulate operations. The GPU automatically truncates FP32 inputs to TF32 precision (10 mantissa bits) before multiplying, then accumulates the results in FP32.
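The precision loss can be emulated on the CPU by zeroing the 13 low mantissa bits of an FP32 value. This is a minimal sketch of the truncation described above, not a bit-exact model of Tensor Core hardware; the function name `truncate_to_tf32` is hypothetical.

```python
import struct


def truncate_to_tf32(x: float) -> float:
    """Zero the low 13 mantissa bits of x's FP32 encoding (hypothetical helper)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    bits &= 0xFFFFE000  # keep sign, 8 exponent bits, and the top 10 mantissa bits
    (y,) = struct.unpack("<f", struct.pack("<I", bits))
    return y


print(truncate_to_tf32(1.0 + 2**-10))  # 1.0009765625: the 2**-10 bit survives
print(truncate_to_tf32(1.0 + 2**-11))  # 1.0: the 2**-11 bit is below TF32 precision
```

The result is still an ordinary FP32 value, which mirrors how inputs and outputs stay in FP32 while only the multiplication sees reduced precision.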

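This gives up to 8× the peak throughput of FP32 matrix math on A100, with negligible accuracy loss for most deep learning workloads. In PyTorch, TF32 is enabled by default for cuDNN convolutions, but since PyTorch 1.12 it is disabled by default for matrix multiplications (torch.matmul, torch.nn.Linear). To turn it on, set torch.backends.cuda.matmul.allow_tf32 = True or call torch.set_float32_matmul_precision("high").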
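PyTorch exposes TF32 through backend flags, and setting them explicitly avoids depending on a default that has differed between PyTorch releases. The sketch below wraps the two flags in a hypothetical helper, `enable_tf32`; it assumes PyTorch is installed, and the flags only have an effect on GPUs with TF32-capable Tensor Cores.

```python
def enable_tf32(matmul: bool = True, conv: bool = True) -> None:
    """Toggle TF32 Tensor Core math for FP32 matmuls and cuDNN convolutions.

    Hypothetical convenience wrapper around two existing PyTorch flags.
    """
    import torch  # imported lazily so the helper can be defined without PyTorch

    torch.backends.cuda.matmul.allow_tf32 = matmul  # FP32 matmul / nn.Linear
    torch.backends.cudnn.allow_tf32 = conv          # FP32 cuDNN convolutions
```

Calling `enable_tf32()` once at program start is enough. Tensors keep the dtype torch.float32 either way, since TF32 only changes how the multiplication is carried out.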

Not a storage format. You'll never see TF32 tensors in memory. TF32 is a compute mode: the GPU truncates FP32 inputs on the fly during Tensor Core operations. Inputs and outputs remain in FP32 format.

Encoding Rules

Normal Numbers

value = (-1)^{sign} × 2^{exponent - 127} × (1 + mantissa / 2^{10})

TF32 uses the same exponent encoding as FP32, but with only 10 mantissa bits (like FP16). The bias of 127 is calculated as 2^{e-1} - 1, where e is the number of exponent bits (8), identical to FP32 and BF16.
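The normal-number formula can be checked with a few lines of Python. The function name `tf32_normal_value` is hypothetical, and the sketch assumes that exponent fields 0 and 255 are reserved for subnormal and special encodings, as in FP32.

```python
def tf32_normal_value(sign: int, exponent: int, mantissa: int) -> float:
    """Evaluate the normal-number formula for an exponent field in 1..254."""
    if not 1 <= exponent <= 254:
        # Assumption: 0 and 255 are reserved, following the FP32 convention.
        raise ValueError("exponent fields 0 and 255 are not normal encodings")
    if not 0 <= mantissa < (1 << 10):
        raise ValueError("mantissa must fit in 10 bits")
    return (-1.0) ** sign * 2.0 ** (exponent - 127) * (1 + mantissa / 2**10)


bias = 2 ** (8 - 1) - 1  # 8 exponent bits
print(bias)                             # 127
print(tf32_normal_value(0, 127, 0))     # 1.0
print(tf32_normal_value(0, 127, 1))     # 1.0009765625, the next value after 1.0
print(tf32_normal_value(0, 254, 1023))  # largest finite value, about 3.4e38
```

The gap between 1.0 and the next representable value is 2^{-10}, the same step size as FP16, while the largest finite value is on the order of FP32's.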

Subnormal Numbers

value = (-1)^{sign} × 2^{-126} × (0 + mantissa / 2^{10})
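A short sketch of the subnormal formula shows how far below 2^{-126} the format reaches. The function name `tf32_subnormal_value` is hypothetical; it applies when the exponent field is 0.

```python
def tf32_subnormal_value(sign: int, mantissa: int) -> float:
    """Evaluate the subnormal formula (exponent field equal to 0)."""
    if not 0 <= mantissa < (1 << 10):
        raise ValueError("mantissa must fit in 10 bits")
    return (-1.0) ** sign * 2.0 ** -126 * (mantissa / 2**10)


smallest = tf32_subnormal_value(0, 1)    # one mantissa step above zero
largest = tf32_subnormal_value(0, 1023)  # just below the smallest normal value
print(smallest == 2.0 ** -136)  # True
print(largest < 2.0 ** -126)    # True
```

With a mantissa of 0 the formula yields zero, so zero falls out of the subnormal encoding rather than needing a separate rule.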

Special Values

Where TF32 Is Used

Not a general-purpose format. TF32 is only used internally by NVIDIA Tensor Cores during specific operations (matmul, convolutions). You cannot store tensors in TF32 or select it as a dtype in frameworks. If you need a 16-bit storage format with similar precision, use FP16 instead: it has the same 10-bit mantissa, though its 5-bit exponent gives it a much narrower range.