OCP microscaling format: 3 exponent bits, 2 mantissa bits, no infinity or NaN
FP6 E3M2 squeezes a floating-point number into just 6 bits. The 3 exponent bits provide a modest range, while the 2 mantissa bits give 4 values per power-of-2 interval.
FP6 E3M2 is part of the OCP Microscaling (MX) Specification, designed for extreme quantization in machine learning. With only 64 possible bit patterns (6 bits), it represents a tiny subset of real numbers.
This format is not meant to be used on its own. It is designed for microscaling blocks, where a shared block exponent (a per-block E8M0 scale) provides additional range. The 6-bit element captures the relative differences within a block, while the shared exponent positions the entire block on the number line.
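The split between a shared exponent and 6-bit elements can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and the rule for picking the shared exponent (exponent of the block's largest magnitude minus the largest element exponent, 4) is one common choice assumed here. Real implementations also clamp the shared exponent to the E8M0 range and typically use fixed-size blocks; the sketch accepts any block length and skips the clamping.

```python
import math

ELEM_BIAS = 3   # E3M2 exponent bias
ELEM_EMAX = 4   # exponent of the largest E3M2 value (28 = 1.75 * 2**4)

# Non-negative E3M2 magnitudes, indexed by the 5-bit code EEEMM.
GRID = [
    (man / 4) * 2.0 ** (1 - ELEM_BIAS) if exp == 0
    else (1 + man / 4) * 2.0 ** (exp - ELEM_BIAS)
    for exp in range(8)
    for man in range(4)
]

def round_to_e3m2(x):
    """Round x to the nearest E3M2 value (ties to even code), saturating at +/-28."""
    code = min(range(len(GRID)), key=lambda i: (abs(GRID[i] - abs(x)), i & 1))
    return math.copysign(GRID[code], x)

def quantize_block(block):
    """Return (shared_exponent, element_values) for one block of floats."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 0, [0.0] * len(block)
    # frexp gives amax = m * 2**e with 0.5 <= m < 1, so floor(log2(amax)) == e - 1.
    shared_exp = (math.frexp(amax)[1] - 1) - ELEM_EMAX
    scale = 2.0 ** shared_exp
    return shared_exp, [round_to_e3m2(v / scale) for v in block]

def dequantize_block(shared_exp, elems):
    """Reconstruct approximate floats from the shared exponent and elements."""
    return [e * 2.0 ** shared_exp for e in elems]

shared_exp, elems = quantize_block([0.01, -0.5, 3.0, 100.0])
print(shared_exp, dequantize_block(shared_exp, elems))
```

In this example the largest input (100.0) sets the shared exponent, so the smallest input (0.01) falls below the element's resolution and is flushed to zero. That is the trade-off of sharing one exponent across a block.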
FP6 E3M2 has more range than its sibling E2M3 (3 exponent bits vs 2), but less precision (2 mantissa bits vs 3). It does not support infinity or NaN, so all 64 bit patterns represent finite numbers (or zero).
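The range-versus-precision trade-off can be checked numerically. The sketch below assumes the usual minifloat conventions (bias of 2^(e-1) - 1, subnormals at exponent code 0) and, because neither format reserves encodings for infinity or NaN, treats the top exponent code as an ordinary exponent. The helper name fp_range is hypothetical.

```python
def fp_range(exp_bits, man_bits):
    """Return (bias, smallest subnormal, largest value) for a minifloat
    that has no Inf/NaN encodings."""
    bias = 2 ** (exp_bits - 1) - 1
    min_subnormal = 2.0 ** (1 - bias) * 2.0 ** -man_bits
    max_exp = (2 ** exp_bits - 1) - bias          # top exponent code is usable
    max_value = 2.0 ** max_exp * (2 - 2.0 ** -man_bits)
    return bias, min_subnormal, max_value

for name, e, m in [("E3M2", 3, 2), ("E2M3", 2, 3)]:
    bias, lo, hi = fp_range(e, m)
    print(f"{name}: bias={bias} min_subnormal={lo} max={hi}")
# E3M2: bias=3 min_subnormal=0.0625 max=28.0
# E2M3: bias=1 min_subnormal=0.125 max=7.5
```

E3M2 reaches 28 where E2M3 stops at 7.5, but E2M3 places 8 values in each power-of-2 interval where E3M2 places 4.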
The bias of 3 is calculated as 2^(e-1) - 1, where e is the number of exponent bits (3): 2^(3-1) - 1 = 3.
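To show how the bias enters the value of an encoding, here is a minimal decoder sketch. It assumes the bit layout 0bSEEEMM (sign, 3 exponent bits, 2 mantissa bits) and the usual subnormal convention at exponent code 0. The function name decode_e3m2 is hypothetical.

```python
def decode_e3m2(bits):
    """Decode a 6-bit pattern 0bSEEEMM into a Python float."""
    if not 0 <= bits < 64:
        raise ValueError("expected a 6-bit value")
    sign = -1.0 if bits >> 5 else 1.0
    exp = (bits >> 2) & 0b111
    man = bits & 0b11
    bias = 3                                  # 2**(3 - 1) - 1
    if exp == 0:
        # Subnormal: no implicit leading 1, exponent fixed at 1 - bias.
        return sign * (man / 4) * 2.0 ** (1 - bias)
    # Normal: implicit leading 1, exponent code offset by the bias.
    return sign * (1 + man / 4) * 2.0 ** (exp - bias)

print(decode_e3m2(0b001100))   # exponent code 3, mantissa 0 -> 1.0
print(decode_e3m2(0b011111))   # largest value -> 28.0
print(decode_e3m2(0b000001))   # smallest subnormal -> 0.0625
```

Because no pattern is reserved for infinity or NaN, every input from 0 to 63 decodes to a finite number.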
Click any bit to flip it, drag the slider, or enter a decimal or hex value. The graphs show how values are distributed across the encoding space.
__nv_fp6_e3m2 struct with constructors and rounding conversion operators for device-side sub-byte floating-point operations.
float_e3m2_t and mx_float6_t<float_e3m2_t> for MX block-scaled GPU matrix kernels. The MLIR AMDGPU dialect supports f6E3M2FN via scaled_ext_packed_matrix on gfx12+.
OCP_MXFP6E3M2Spec for quantizing LLMs to MXFP6 E3M2 format using per-block E8M0 scales on ROCm.
float6_e3m2fn as a NumPy dtype extension (6-bit, encoding 0bSEEEMM, byte storage, no Inf/NaN). A PyTorch RFC proposes adding torch.float6_e3m2fn to the PT2 stack.