FP6 E3M2 (OCP) | Floating Point Format Guide

Bit Layout

FP6 E3M2 squeezes a floating-point number into just 6 bits: 1 sign bit, 3 exponent bits, and 2 mantissa bits, laid out as 0bSEEEMM. The 3 exponent bits provide a modest range, while the 2 mantissa bits give 4 values per power-of-2 interval.
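As a minimal sketch of the layout, the three fields can be pulled out of a 6-bit pattern with shifts and masks. The helper name split_fields is hypothetical and not taken from any library.

```python
def split_fields(bits: int) -> tuple[int, int, int]:
    """Split a 6-bit E3M2 pattern (0bSEEEMM) into (sign, exponent, mantissa)."""
    if not 0 <= bits <= 0b111111:
        raise ValueError("expected a 6-bit pattern in the range 0..63")
    sign = (bits >> 5) & 0b1        # top bit
    exponent = (bits >> 2) & 0b111  # next three bits
    mantissa = bits & 0b11          # low two bits
    return sign, exponent, mantissa


# 0b101110 -> sign 1, exponent 0b011, mantissa 0b10
print(split_fields(0b101110))
```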

Overview

FP6 E3M2 is part of the OCP Microscaling (MX) Specification, designed for extreme quantization in machine learning. With only 64 possible bit patterns (6 bits), it represents a tiny subset of real numbers.

This format is never used alone. It's designed to be used within microscaling blocks, where a shared block exponent provides additional range. The 6-bit element captures the relative differences within a block, while the shared exponent positions the entire block on the number line.
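To make the split between shared exponent and per-element value concrete, here is a rough sketch of quantizing one block. It assumes the shared exponent is chosen as floor(log2(max |x|)) minus the element format's largest exponent (4 for E3M2, since 28 = 1.75 × 2^4), that out-of-range elements saturate, and that rounding goes to the nearest representable magnitude. The names quantize_block and dequantize_block are hypothetical. The shared exponent is kept as a plain Python int rather than an encoded scale byte, and elements are returned as decoded values rather than bit patterns.

```python
import math

E3M2_EMAX = 4  # exponent of the largest normal value: 28 = 1.75 * 2**4


def e3m2_magnitudes() -> list[float]:
    """All non-negative values a 6-bit E3M2 element can represent, ascending."""
    mags = []
    for exponent in range(8):
        for mantissa in range(4):
            if exponent == 0:  # subnormal: no implicit leading 1
                mags.append(2.0 ** -2 * (mantissa / 4))
            else:              # normal: bias of 3
                mags.append(2.0 ** (exponent - 3) * (1 + mantissa / 4))
    return mags


MAGNITUDES = e3m2_magnitudes()


def quantize_block(block: list[float]) -> tuple[int, list[float]]:
    """Return (shared_exponent, element_values) for one block (assumed scheme)."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 0, [0.0] * len(block)
    shared_exp = math.floor(math.log2(amax)) - E3M2_EMAX
    scale = 2.0 ** shared_exp
    elements = []
    for x in block:
        target = min(abs(x) / scale, MAGNITUDES[-1])  # saturate at 28
        nearest = min(MAGNITUDES, key=lambda m: abs(m - target))
        elements.append(math.copysign(nearest, x))
    return shared_exp, elements


def dequantize_block(shared_exp: int, elements: list[float]) -> list[float]:
    """Reapply the shared scale to recover approximate original values."""
    return [e * 2.0 ** shared_exp for e in elements]


# A 4-element block keeps the example short; real MX blocks are larger.
shared_exp, elements = quantize_block([1000.0, -3.5, 0.02, 512.0])
print(shared_exp, elements, dequantize_block(shared_exp, elements))
```

In this example the largest element saturates and the smallest rounds to zero, which shows how a single shared exponent trades accuracy on small values for coverage of the block's largest value.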

FP6 E3M2 has more range than its sibling E2M3 (3 exponent bits vs 2), but less precision (2 mantissa bits vs 3). It does not support infinity or NaN, so all 64 bit patterns represent finite numbers (or zero).

Encoding Rules

Normal Numbers

When the exponent field is nonzero (1 through 7), the value has an implicit leading 1:

value = (-1)^sign × 2^(exponent - 3) × (1 + mantissa / 4)

The bias of 3 is calculated as 2^(e-1) - 1, where e is the number of exponent bits (3).

Subnormal Numbers

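When the exponent field is 0, there is no implicit leading 1 and the exponent is fixed at 1 - bias = -2:

value = (-1)^sign × 2^(-2) × (mantissa / 4)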
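The normal and subnormal formulas can be combined into one small decoder. This is an illustrative sketch; decode_e3m2 is a hypothetical name, not a library function.

```python
BIAS = 3  # 2**(3 - 1) - 1 for 3 exponent bits


def decode_e3m2(bits: int) -> float:
    """Decode a 6-bit E3M2 pattern (0bSEEEMM) to a Python float."""
    sign = -1.0 if (bits >> 5) & 1 else 1.0
    exponent = (bits >> 2) & 0b111
    mantissa = bits & 0b11
    if exponent == 0:
        # Subnormal: fixed exponent of 1 - BIAS, no implicit leading 1.
        return sign * 2.0 ** (1 - BIAS) * (mantissa / 4)
    # Normal: implicit leading 1.
    return sign * 2.0 ** (exponent - BIAS) * (1 + mantissa / 4)


for pattern in (0b000001, 0b000100, 0b001100, 0b011111, 0b111111):
    print(f"{pattern:06b} -> {decode_e3m2(pattern)}")
```

The sample patterns cover the smallest subnormal (0.0625), the smallest normal (0.25), the value 1.0, and the largest magnitudes (±28).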

Special Values

Zero: Exponent = 0, Mantissa = 0.
No Infinity: All exponent patterns represent finite numbers.
No NaN: All bit patterns are valid numbers.

All patterns are numbers: Since FP6 E3M2 has no infinity or NaN, every one of its 64 bit patterns (32 positive, 32 negative including ±0) maps to a real number. The maximum exponent (7) represents normal numbers, not special values.
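This claim is easy to check by brute force. The sketch below decodes all 64 patterns with the encoding rules from this page (the function name is hypothetical) and confirms that none is infinite or NaN.

```python
import math


def decode_e3m2(bits: int) -> float:
    """Decode a 6-bit E3M2 pattern (0bSEEEMM) using a bias of 3."""
    sign = -1.0 if (bits >> 5) & 1 else 1.0
    exponent = (bits >> 2) & 0b111
    mantissa = bits & 0b11
    if exponent == 0:
        return sign * 2.0 ** -2 * (mantissa / 4)
    return sign * 2.0 ** (exponent - 3) * (1 + mantissa / 4)


values = [decode_e3m2(b) for b in range(64)]
positives = sorted(v for v in values if v > 0)

all_finite = all(math.isfinite(v) for v in values)
distinct = len(set(values))  # +0.0 and -0.0 compare equal, so they count once

print("all finite:", all_finite)
print("distinct values:", distinct)
print("smallest positive:", positives[0], "largest:", positives[-1])
```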

Where FP6 E3M2 Is Used

OCP Microscaling (MXFP6): The OCP MX Spec v1.0 defines E3M2 as the wider-range 6-bit element type in MXFP6 blocks (32 elements, E8M0 shared scale). With 3 exponent bits it covers magnitudes up to 28, nearly 4× E2M3's maximum of 7.5.
NVIDIA CUDA: The CUDA Math API defines the __nv_fp6_e3m2 struct with constructors and rounding conversion operators for device-side sub-byte floating-point operations.
Kernel libraries: CUTLASS provides float_e3m2_t and mx_float6_t<float_e3m2_t> for MX block-scaled GPU matrix kernels. The MLIR AMDGPU dialect supports f6E3M2FN via scaled_ext_packed_matrix on gfx12+.
AMD quantization: AMD Quark provides OCP_MXFP6E3M2Spec for quantizing LLMs to MXFP6 E3M2 format using per-block E8M0 scales on ROCm.
Python libraries: The ml_dtypes library registers float6_e3m2fn as a NumPy dtype extension (6-bit, encoding 0bSEEEMM, byte storage, no Inf/NaN). A PyTorch RFC proposes adding torch.float6_e3m2fn to the PT2 stack.

E3M2 vs E2M3: E3M2 has wider range (max 28 vs 7.5) but fewer distinct values per interval (4 vs 8). Choose E3M2 when your values span a wider range; choose E2M3 when you need finer granularity within a narrower range.
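The trade-off can be reproduced with a generic 6-bit decoder parameterized by the exponent and mantissa widths. This sketch assumes E2M3 follows the same rules as E3M2 (bias of 2^(e-1) - 1, subnormals at exponent 0, no Inf/NaN); decode_fp6 and summarize are hypothetical names.

```python
def decode_fp6(bits: int, exp_bits: int, man_bits: int) -> float:
    """Decode a 6-bit pattern with the given exponent and mantissa widths."""
    bias = 2 ** (exp_bits - 1) - 1
    sign = -1.0 if (bits >> (exp_bits + man_bits)) & 1 else 1.0
    exponent = (bits >> man_bits) & ((1 << exp_bits) - 1)
    fraction = (bits & ((1 << man_bits) - 1)) / (1 << man_bits)
    if exponent == 0:
        return sign * 2.0 ** (1 - bias) * fraction    # subnormal
    return sign * 2.0 ** (exponent - bias) * (1 + fraction)  # normal


def summarize(exp_bits: int, man_bits: int) -> dict:
    """Report the maximum value and the values inside the interval [1, 2)."""
    values = [decode_fp6(b, exp_bits, man_bits) for b in range(64)]
    return {
        "max": max(values),
        "min_positive": min(v for v in values if v > 0),
        "in_1_to_2": sorted(v for v in values if 1.0 <= v < 2.0),
    }


e3m2 = summarize(exp_bits=3, man_bits=2)
e2m3 = summarize(exp_bits=2, man_bits=3)
print("E3M2:", e3m2)
print("E2M3:", e2m3)
```

Counting the entries in the [1, 2) interval gives 4 for E3M2 and 8 for E2M3, matching the per-interval figures above.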