FP6 E2M3 (OCP) | Floating Point Format Guide

Bit Layout

FP6 E2M3 uses 1 sign bit, 2 exponent bits, and 3 mantissa bits. Compared with E3M2, this gives it higher precision (8 values per power-of-two interval vs 4) but a much narrower range (maximum magnitude 7.5 vs 28).
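As a small illustration of the layout, the sketch below splits a 6-bit code into its three fields. It assumes the sign is the most significant bit, followed by the exponent and then the mantissa (the 0bSEEMMM ordering); the function name `split_fields` is hypothetical and not part of any library.

```python
def split_fields(code: int) -> tuple[int, int, int]:
    """Split a 6-bit E2M3 code (assumed layout 0bSEEMMM) into (sign, exponent, mantissa)."""
    if not 0 <= code < 64:
        raise ValueError("E2M3 codes are 6 bits wide (0..63)")
    sign = (code >> 5) & 0b1       # 1 sign bit
    exponent = (code >> 3) & 0b11  # 2 exponent bits
    mantissa = code & 0b111        # 3 mantissa bits
    return sign, exponent, mantissa


print(split_fields(0b011111))  # all exponent and mantissa bits set, sign clear
print(split_fields(0b101000))  # sign set, exponent field 1, mantissa 0
```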

Overview

FP6 E2M3 is the precision-focused 6-bit format in the OCP Microscaling Specification. Like E3M2, it's designed for use within microscaling blocks rather than as a standalone format.

With 3 mantissa bits, E2M3 can distinguish 8 levels within each power-of-2 interval, giving it the same relative precision as FP8 E4M3. However, with only 2 exponent bits, its range is extremely limited: the largest magnitude is 7.5, the smallest normal is 1.0, and the smallest subnormal is 0.125.

This format is best suited for data distributions that are tightly clustered (like normalized weights), where the shared block exponent handles the coarse positioning and E2M3 provides fine-grained relative differences.
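The division of labor between the shared block exponent and the E2M3 elements can be sketched as follows. This is a simplified illustration, not a reference implementation: the scale rule (align the block's largest magnitude with E2M3's top binade), the saturating clamp, and the nearest-value rounding with first-match tie-breaking are all assumptions made here, and the shared exponent is not clamped to any storage format's range. The names `E2M3_GRID`, `E2M3_EMAX`, and `quantize_block` are hypothetical.

```python
import math

# Non-negative E2M3 magnitudes: zero and subnormals first, then three normal binades.
E2M3_GRID = [m / 8 for m in range(8)] + [
    2.0 ** (e - 1) * (1 + m / 8) for e in (1, 2, 3) for m in range(8)
]
E2M3_EMAX = 2  # exponent of the largest binade: 7.5 = 2**2 * 1.875


def quantize_block(block):
    """Return (shared_exp, elements) with block[i] ~= 2**shared_exp * elements[i]."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 0, [0.0] * len(block)
    # Assumed scale rule: put the block's largest magnitude in E2M3's top binade.
    shared_exp = math.floor(math.log2(amax)) - E2M3_EMAX
    scale = 2.0 ** shared_exp
    elements = []
    for v in block:
        # Saturate to the E2M3 range, then pick the nearest representable magnitude.
        target = min(abs(v) / scale, E2M3_GRID[-1])
        mag = min(E2M3_GRID, key=lambda g: abs(g - target))
        elements.append(math.copysign(mag, v))
    return shared_exp, elements


shared_exp, elements = quantize_block([0.011, -0.02, 0.03, 0.0005])
print(shared_exp, elements)
print([e * 2.0 ** shared_exp for e in elements])  # dequantized approximation
```

The shared exponent absorbs the overall magnitude of the block, so the elements only need to encode relative differences within roughly three binades plus the subnormal range.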

Encoding Rules

Normal Numbers

value = (-1)^sign × 2^(exponent - 1) × (1 + mantissa / 8)

Normal numbers have an exponent field of 1, 2, or 3. With 3 mantissa bits, the significand takes one of 8 values in each interval: 1.000, 1.125, 1.250, 1.375, 1.500, 1.625, 1.750, 1.875, each multiplied by that interval's power of 2. The exponent bias of 1 follows from 2^(e-1) - 1, where e = 2 is the number of exponent bits.
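The normal-number formula can be checked with a few lines of Python. This is a minimal sketch of the formula above; the names `EXP_BITS`, `MAN_BITS`, `BIAS`, and `normal_value` are chosen here for illustration.

```python
EXP_BITS = 2
MAN_BITS = 3
BIAS = 2 ** (EXP_BITS - 1) - 1  # evaluates to 1


def normal_value(sign: int, exponent: int, mantissa: int) -> float:
    """Value of a normal E2M3 number; the exponent field must be 1, 2, or 3."""
    if exponent == 0:
        raise ValueError("exponent field 0 encodes zero or a subnormal")
    return (-1) ** sign * 2.0 ** (exponent - BIAS) * (1 + mantissa / 2 ** MAN_BITS)


# List the 8 values in each of the three normal intervals: [1, 2), [2, 4), [4, 8).
for e in (1, 2, 3):
    print(e, [normal_value(0, e, m) for m in range(8)])
```

The last entry of the final row, exponent field 3 with mantissa 7, is the format's maximum of 7.5.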

Subnormal Numbers

value = (-1)^sign × 2⁰ × (mantissa / 8)

Subnormals apply when the exponent field is 0 and the mantissa is nonzero. They range from 0.125 to 0.875 in steps of 0.125.
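A short sketch of the subnormal formula, listing the seven positive subnormals. The function name `subnormal_value` is hypothetical.

```python
def subnormal_value(sign: int, mantissa: int) -> float:
    """Value of an E2M3 code whose exponent field is 0 (zero or subnormal)."""
    # No implicit leading 1; the scale is fixed at 2**(1 - bias) = 2**0.
    return (-1) ** sign * 2.0 ** 0 * (mantissa / 8)


print([subnormal_value(0, m) for m in range(1, 8)])
# [0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875]
```

The 0.125 step is the same as the spacing in the first normal interval [1, 2), so values are evenly spaced from 0 up to 2.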

Special Values

Zero: Exponent = 0, Mantissa = 0. The sign bit distinguishes +0 and -0.
No Infinity or NaN: All bit patterns are finite numbers.
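These properties can be confirmed by decoding all 64 bit patterns. The decoder below is a sketch that combines the normal and subnormal formulas and assumes the 0bSEEMMM bit ordering; `decode_e2m3` is a name chosen here, not a library function.

```python
import math


def decode_e2m3(code: int) -> float:
    """Decode a 6-bit E2M3 code (assumed layout 0bSEEMMM) to a Python float."""
    sign = -1.0 if (code >> 5) & 1 else 1.0
    exponent = (code >> 3) & 0b11
    mantissa = code & 0b111
    if exponent == 0:
        return sign * (mantissa / 8)  # zero or subnormal
    return sign * 2.0 ** (exponent - 1) * (1 + mantissa / 8)  # normal, bias 1


values = [decode_e2m3(c) for c in range(64)]
zero_codes = [c for c, v in enumerate(values) if v == 0.0]

print(all(math.isfinite(v) for v in values))    # every pattern is a finite number
print([format(c, "06b") for c in zero_codes])   # the two zero encodings
print(min(values), max(values))                 # the representable range
```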

Where FP6 E2M3 Is Used

OCP Microscaling (MXFP6): The OCP MX Spec v1.0 defines E2M3 as the higher-precision 6-bit element type in MXFP6 blocks. Its 3 mantissa bits provide 8 distinct values per power-of-two interval, double E3M2's 4.
NVIDIA CUDA: The CUDA Math API defines the __nv_fp6_e2m3 struct with conversion constructors for device-side sub-byte computation.
Kernel libraries: CUTLASS provides float_e2m3_t and mx_float6_t<float_e2m3_t> MX wrapper types. The MLIR AMDGPU dialect supports f6E2M3FN via scaled_ext_packed_matrix on gfx12+.
AMD quantization: AMD Quark provides OCP_MXFP6E2M3Spec for E2M3 MXFP6 quantization of LLMs on ROCm.
Python libraries: The ml_dtypes library registers float6_e2m3fn as a NumPy dtype extension (6-bit, encoding 0bSEEMMM, byte storage, no Inf/NaN).

E2M3 vs E3M2

Both are 6-bit formats with opposite tradeoffs. E2M3 has higher precision (3 mantissa bits) in a narrow range (max 7.5). E3M2 has wider range (max 28) with lower precision (2 mantissa bits). The choice depends on whether your data's spread or its granularity is the limiting factor.
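The tradeoff can be made concrete with a decoder parameterized by field widths. This sketch assumes both formats use a sign/exponent/mantissa layout, a bias of 2^(e-1) - 1, and no Inf or NaN encodings; the helper names `decode_minifloat` and `summarize` are hypothetical.

```python
def decode_minifloat(code: int, exp_bits: int, man_bits: int) -> float:
    """Decode a sign/exponent/mantissa code, assuming no Inf or NaN encodings."""
    bias = 2 ** (exp_bits - 1) - 1
    sign = -1.0 if (code >> (exp_bits + man_bits)) & 1 else 1.0
    exponent = (code >> man_bits) & ((1 << exp_bits) - 1)
    frac = (code & ((1 << man_bits) - 1)) / (1 << man_bits)
    if exponent == 0:
        return sign * 2.0 ** (1 - bias) * frac       # zero or subnormal
    return sign * 2.0 ** (exponent - bias) * (1 + frac)  # normal


def summarize(exp_bits: int, man_bits: int) -> dict:
    """Collect range and granularity figures for the non-negative codes."""
    n_codes = 1 << (exp_bits + man_bits)  # codes with the sign bit clear
    values = [decode_minifloat(c, exp_bits, man_bits) for c in range(n_codes)]
    return {
        "max": max(values),
        "smallest": min(v for v in values if v > 0),
        "per_binade": sum(1 for v in values if 1.0 <= v < 2.0),
    }


print("E2M3", summarize(2, 3))
print("E3M2", summarize(3, 2))
```

Under these assumptions E2M3 places 8 values in [1, 2) and tops out at 7.5, while E3M2 places 4 values there and reaches 28.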