FP16 (Half Precision) | Floating Point Format Guide

Bit Layout

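An FP16 number uses 16 bits divided into three fields: 1 sign bit, 5 exponent bits, and 10 mantissa bits.

The sign bit selects positive (0) or negative (1). The exponent is stored with a bias of 15. The mantissa stores 10 bits of fractional precision; for normal numbers an implicit leading 1 gives 11 significant bits.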
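As a minimal sketch of this layout, the three fields can be pulled out of a 16-bit pattern with shifts and masks. The helper name `split_fp16` is illustrative and not part of any library.

```python
def split_fp16(bits: int):
    """Split a 16-bit pattern into its (sign, exponent, mantissa) fields."""
    sign = (bits >> 15) & 0x1        # 1 bit
    exponent = (bits >> 10) & 0x1F   # 5 bits, stored with a bias of 15
    mantissa = bits & 0x3FF          # 10 bits
    return sign, exponent, mantissa


print(split_fp16(0x3C00))  # bit pattern of 1.0  -> (0, 15, 0)
print(split_fp16(0xC000))  # bit pattern of -2.0 -> (1, 16, 0)
```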

Overview

FP16, also known as half precision, was added to the IEEE 754-2008 standard as a storage and interchange format. At just 16 bits, it uses half the memory of FP32, making it attractive for applications where memory bandwidth is a bottleneck.

With only 5 exponent bits, FP16 has a much narrower range than FP32, with a maximum value of only 65,504. This means large values (like loss values in deep learning) can easily overflow to infinity. However, for values within its range, the 10 mantissa bits provide about 3.3 decimal digits of precision.
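The limited precision and range can be observed with Python's standard `struct` module, whose `e` format code packs IEEE 754 half precision. The sketch below is illustrative only; `to_fp16` is a hypothetical helper name. Note that `struct` raises `OverflowError` for out-of-range inputs, whereas FP16 arithmetic on hardware would typically round such a result to infinity.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest FP16 value and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


print(to_fp16(0.1))      # 0.0999755859375
print(to_fp16(65504.0))  # 65504.0, the largest finite FP16 value
print(to_fp16(2049.0))   # 2048.0: above 2048, not every integer is representable

try:
    to_fp16(70000.0)
except OverflowError as err:
    # struct refuses to pack a value beyond the FP16 range.
    print("overflow:", err)
```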

FP16 is widely used in ML inference, computer graphics (HDR textures), and as a storage format on GPUs. On recent NVIDIA data-center GPUs such as the A100 and H100, Tensor Cores run FP16 matrix multiplications at about 2× the rate of TF32, the Tensor Core mode used for FP32 data.

FP16 vs BF16

FP16 and BF16 are both 16-bit formats, but they make different trade-offs. FP16 has higher precision (10 mantissa bits vs 7) but much narrower range (max ~65K vs ~3.4×10³⁸). BF16's wider range makes it better for training, while FP16's precision makes it good for inference.
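The range and precision figures in this comparison follow from the field widths alone. The sketch below assumes both formats use the IEEE-style layout (bias of 2^(e-1) - 1, top exponent code reserved for infinity and NaN) and that BF16 has 8 exponent bits; `format_stats` is a hypothetical helper name.

```python
def format_stats(exp_bits: int, man_bits: int):
    """Return (largest finite value, spacing just above 1.0) for an IEEE-style format."""
    bias = 2 ** (exp_bits - 1) - 1
    # The largest normal exponent equals the bias; the mantissa is all ones.
    max_value = (2 - 2.0 ** -man_bits) * 2.0 ** bias
    epsilon = 2.0 ** -man_bits
    return max_value, epsilon


fp16_max, fp16_eps = format_stats(exp_bits=5, man_bits=10)
bf16_max, bf16_eps = format_stats(exp_bits=8, man_bits=7)

print(f"FP16: max={fp16_max:.6g}, eps={fp16_eps:.3g}")
print(f"BF16: max={bf16_max:.6g}, eps={bf16_eps:.3g}")
print(f"BF16 spacing near 1.0 is {bf16_eps / fp16_eps:.0f}x coarser than FP16")
```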

Encoding Rules

Normal Numbers

value = (-1)^sign × 2^(exponent - 15) × (1 + mantissa / 2^10)

Normal FP16 numbers have a biased exponent between 1 and 30 (actual exponent -14 to +15). The bias of 15 is calculated as 2^(e-1) - 1, where e is the number of exponent bits (5).
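As a worked example of the normal-number formula, the sketch below decodes one bit pattern by hand and compares it with Python's `struct` module, whose `e` format code reads IEEE 754 half precision. The pattern 0x3555 is an arbitrary illustrative choice.

```python
import struct

bits = 0x3555                     # example bit pattern (close to 1/3)
sign = bits >> 15
exponent = (bits >> 10) & 0x1F    # biased exponent, here 13
mantissa = bits & 0x3FF           # here 341

assert 1 <= exponent <= 30        # the formula applies to normal numbers only
value = (-1) ** sign * 2.0 ** (exponent - 15) * (1 + mantissa / 2 ** 10)

reference = struct.unpack("<e", bits.to_bytes(2, "little"))[0]
print(value, reference)           # both 0.333251953125
```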

Subnormal Numbers

value = (-1)^sign × 2^(-14) × (0 + mantissa / 2^10)

Subnormal numbers have a biased exponent of 0 and a nonzero mantissa. They bridge the gap between the smallest normal number and zero, enabling gradual underflow.
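A short sketch of the subnormal formula shows how these values fill the interval below the smallest normal number with evenly spaced steps. The names `subnormal_value` and `MIN_NORMAL` are illustrative.

```python
MIN_NORMAL = 2.0 ** -14             # exponent field 1, mantissa 0


def subnormal_value(mantissa: int) -> float:
    """Value of a positive subnormal: exponent field 0, mantissa 1..1023."""
    return 2.0 ** -14 * (0 + mantissa / 2 ** 10)


smallest = subnormal_value(1)       # 2**-24, about 5.96e-08
largest = subnormal_value(1023)     # one step below MIN_NORMAL
step = subnormal_value(2) - subnormal_value(1)

print(smallest, largest, MIN_NORMAL)
print(MIN_NORMAL - largest == step)  # True: no gap at the normal boundary
```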

Special Values

Zero: Exponent = 0, Mantissa = 0.
Infinity: Exponent = 31 (all ones), Mantissa = 0.
NaN: Exponent = 31, Mantissa ≠ 0.
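The encoding rules above amount to a small decision tree on the exponent and mantissa fields. The following sketch classifies a 16-bit pattern accordingly; `classify_fp16` is a hypothetical helper, and the sign bit is ignored because every category exists in both signs.

```python
def classify_fp16(bits: int) -> str:
    """Name the encoding category of a 16-bit FP16 pattern."""
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    if exponent == 0:
        return "zero" if mantissa == 0 else "subnormal"
    if exponent == 31:
        return "infinity" if mantissa == 0 else "nan"
    return "normal"


for pattern in (0x0000, 0x8000, 0x0001, 0x3C00, 0x7C00, 0xFC00, 0x7E00):
    print(f"0x{pattern:04X}: {classify_fp16(pattern)}")
```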

Where FP16 Is Used

GPU Tensor Cores: NVIDIA GPUs from Volta onward accelerate FP16 matrix operations via Tensor Cores (.f16 / .f16x2 types in the PTX ISA). On the H100 SXM, FP16 Tensor Cores deliver roughly 1,000 dense TFLOPS (roughly 2,000 with structured sparsity).
Mixed-precision training: NVIDIA's mixed-precision training guide documents the FP16 recipe: forward and backward passes in FP16, FP32 master weights, and dynamic loss scaling to prevent gradient underflow.
ML frameworks: PyTorch provides torch.float16 as a core dtype. The ONNX specification defines FLOAT16 = 10 as an IEEE 754 half-precision type for model interchange.
Kernel compilers: CUTLASS defines half_t for templated MMA abstractions. Triton exports float16 as a first-class kernel dtype, and the MLIR NVVM dialect maps FP16 to Tensor Core MMA shapes.
Graphics and HDR imaging: FP16 is the standard for HDR color channels and texture storage in OpenGL and Vulkan, providing sufficient dynamic range and precision for visual data.

Limited range

FP16 overflows above 65,504 and cannot represent nonzero magnitudes below ~6×10^-8, the smallest subnormal. Full precision is already lost below ~6.1×10^-5, the smallest normal value. Loss values, gradient norms, and intermediate activations can exceed these bounds. If you encounter frequent infinities or zeros, consider BF16 (wider range) or add loss scaling.
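To illustrate why loss scaling helps with underflow, the sketch below rounds a small value to FP16 with and without a scale factor, using Python's `struct` module (`e` is its half-precision format code). The gradient value and the scale of 2^16 are hypothetical. It shows only the scaling arithmetic, not the dynamic adjustment of the scale that training frameworks perform.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest FP16 value and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


grad = 1e-8            # hypothetical gradient, below FP16's smallest subnormal
scale = 2.0 ** 16      # hypothetical loss scale; a power of two scales exactly

unscaled = to_fp16(grad)            # rounds to 0.0, so the gradient is lost
scaled = to_fp16(grad * scale)      # lands in FP16's normal range
recovered = scaled / scale          # unscale in higher precision

print(unscaled)    # 0.0
print(recovered)   # close to 1e-8, within FP16 rounding error
```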