FP8 E5M2 (OCP) | Floating Point Format Guide

Bit Layout

FP8 E5M2 packs a full floating-point number into just 8 bits: 1 sign bit, 5 exponent bits, and 2 mantissa bits. The 5 exponent bits give it the same exponent range as FP16, while the 2 mantissa bits provide very coarse precision, with only 4 representable values within each power-of-2 interval.
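
As a rough illustration of the layout, the sketch below splits a byte into its three fields. It assumes the usual ordering (sign in the most significant bit, then exponent, then mantissa); the function name split_e5m2 is hypothetical and not taken from any library.

```python
def split_e5m2(byte: int) -> tuple[int, int, int]:
    """Split an 8-bit E5M2 encoding into (sign, exponent, mantissa) fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    sign = (byte >> 7) & 0x1       # top bit
    exponent = (byte >> 2) & 0x1F  # next 5 bits
    mantissa = byte & 0x3          # low 2 bits
    return sign, exponent, mantissa


print(split_e5m2(0x3C))  # (0, 15, 0), the encoding of 1.0
print(split_e5m2(0x7B))  # (0, 30, 3), the largest finite value
```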

Overview

FP8 E5M2 is one of two 8-bit floating-point formats defined by the OCP (Open Compute Project) Microscaling Specification. It prioritizes dynamic range over precision, making it the FP8 variant better suited for gradient representation during training.

The OCP variant of E5M2 follows standard IEEE 754 rules for special values, supporting both infinity and NaN. This makes it a natural "tiny FP16" since it shares FP16's exponent structure (5 bits, bias 15).
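
One way to see the "tiny FP16" relationship: because the sign and exponent fields line up, an E5M2 byte can be read as the high byte of an FP16 value whose low 8 mantissa bits are zero. The sketch below relies on that observation and uses Python's struct module (format code "e" is IEEE 754 half precision); the helper name e5m2_to_float is hypothetical.

```python
import struct


def e5m2_to_float(byte: int) -> float:
    """Decode an E5M2 byte by treating it as the high byte of an FP16 value."""
    half_bits = (byte & 0xFF) << 8  # low 8 mantissa bits of the FP16 are zero
    return struct.unpack("<e", struct.pack("<H", half_bits))[0]


print(e5m2_to_float(0x3C))  # 1.0
print(e5m2_to_float(0x7B))  # 57344.0
print(e5m2_to_float(0x7C))  # inf
```

The same trick works in reverse only with care: going from FP16 to E5M2 requires rounding away the low 8 bits, which this sketch does not attempt.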

FP8 E5M2 is supported on NVIDIA Hopper (H100) and Ada Lovelace GPUs, AMD MI300, and Intel Gaudi accelerators.

E5M2 vs E4M3: E5M2 has wider range but lower precision (4 values per power of 2). E4M3 has narrower range but higher precision (8 values per power of 2). Typically, E5M2 is used for gradients and E4M3 for activations/weights.

Encoding Rules

Normal Numbers

value = (-1)^sign × 2^(exponent - 15) × (1 + mantissa / 4)

With only 2 mantissa bits, there are exactly 4 representable values in each power-of-2 interval: 1.00, 1.25, 1.50, and 1.75 (times the power of 2). The bias of 15 is calculated as 2^(e-1) - 1, where e is the number of exponent bits (5).
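
The formula can be checked with a few lines of Python. This is a minimal sketch, assuming normal numbers occupy exponent fields 1 through 30 (0 and 31 being reserved for subnormals and special values); decode_normal is a hypothetical helper name.

```python
def decode_normal(sign: int, exponent: int, mantissa: int) -> float:
    """Evaluate the normal-number formula from raw field values."""
    if not 1 <= exponent <= 30:
        raise ValueError("normal numbers use exponent fields 1 through 30")
    return (-1.0) ** sign * 2.0 ** (exponent - 15) * (1 + mantissa / 4)


bias = 2 ** (5 - 1) - 1  # 2^(e-1) - 1 with e = 5 exponent bits
print(bias)  # 15

# The four representable values in the interval [1, 2)
print([decode_normal(0, 15, m) for m in range(4)])  # [1.0, 1.25, 1.5, 1.75]

# Largest finite value: exponent field 30, mantissa 3
print(decode_normal(0, 30, 3))  # 57344.0
```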

Subnormal Numbers

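Subnormal numbers have an exponent field of 0 and a nonzero mantissa. The implicit leading 1 is dropped and the exponent is fixed at -14:

value = (-1)^sign × 2^-14 × (mantissa / 4)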
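
A short sketch of the subnormal formula follows, assuming it applies only when the exponent field is 0; decode_subnormal is a hypothetical helper name. It shows that the three positive subnormals are evenly spaced steps of 2^-16 below the smallest normal value, 2^-14.

```python
def decode_subnormal(sign: int, mantissa: int) -> float:
    """Evaluate the subnormal formula (exponent field is 0, no implicit 1)."""
    if not 0 <= mantissa <= 3:
        raise ValueError("mantissa is a 2-bit field")
    return (-1.0) ** sign * 2.0 ** -14 * (mantissa / 4)


# Smallest positive subnormal is 2^-16
print(decode_subnormal(0, 1) == 2.0 ** -16)  # True

# Each subnormal is a whole multiple of 2^-16: 1x, 2x, 3x
print([decode_subnormal(0, m) / 2.0 ** -16 for m in (1, 2, 3)])  # [1.0, 2.0, 3.0]
```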

Special Values

Zero: Exponent = 0, Mantissa = 0.
Infinity: Exponent = 31, Mantissa = 0.
NaN: Exponent = 31, Mantissa ≠ 0 (3 possible NaN encodings per sign, 6 in total).
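
The rules above can be combined into a small classifier and checked by walking all 256 encodings. This is an illustrative sketch only; classify_e5m2 is a hypothetical name, and the counts include both sign values.

```python
from collections import Counter


def classify_e5m2(byte: int) -> str:
    """Label an E5M2 encoding using the exponent and mantissa fields."""
    exponent = (byte >> 2) & 0x1F
    mantissa = byte & 0x3
    if exponent == 0:
        return "zero" if mantissa == 0 else "subnormal"
    if exponent == 31:
        return "infinity" if mantissa == 0 else "nan"
    return "normal"


counts = Counter(classify_e5m2(b) for b in range(256))
print(counts["zero"], counts["infinity"], counts["nan"])  # 2 2 6
print(counts["subnormal"], counts["normal"])  # 6 240
```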

Where FP8 E5M2 Is Used

NVIDIA Hopper Tensor Cores: The H100 architecture supports E5M2 with 5 exponent bits and 2 mantissa bits (max ±57,344, with IEEE-style ±inf and NaN). Its wider dynamic range makes it the standard choice for backward-pass gradients in FP8 training.
OCP Microscaling: The OCP MX Spec v1.0 specifies E5M2 as an alternative MXFP8 element type, using 32-element blocks with E8M0 shared scales.
ML frameworks: PyTorch provides torch.float8_e5m2 as a native dtype (supports NaN/inf, follows IEEE 754). The ONNX specification defines FLOAT8E5M2 = 19 with a note: "mostly used for gradients."
Kernel libraries: CUTLASS defines float_e5m2_t with inline PTX conversion instructions. Triton exports float8e5 as a kernel dtype, and the MLIR AMDGPU dialect supports E5M2 via dot4.f32.bf8 instructions on gfx11+.
NumPy ecosystem: The ml_dtypes library registers float8_e5m2 (and float8_e5m2fnuz) as NumPy custom dtype extensions for use in JAX and TensorFlow pipelines.

E5M2 vs E4M3 in practice: Most FP8 training workflows use both formats: E4M3 for forward-pass weights and activations (higher precision), E5M2 for backward-pass gradients (wider dynamic range). MXFP8 block scaling can eliminate the need for E5M2 entirely by using E4M3 for all tensors.