4-bit OCP floating-point format with only 16 possible values
FP4 E2M1 packs a floating-point number into just 4 bits. With 1 sign bit, 2 exponent bits, and just 1 mantissa bit, it has only 16 bit patterns: 8 with the sign bit clear and 8 with it set, counting +0 and -0 separately.
FP4 E2M1 is defined in the OCP Microscaling Specification as the smallest element format for microscaling blocks. At 4 bits per element, it is 8× smaller than FP32, before counting the shared block scale.
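With only 1 mantissa bit, each power-of-2 interval contains exactly 2 values: 1.0 and 1.5 times the power of 2 that starts the interval (the implicit leading 1 followed by a mantissa bit of 0 or 1). For example, between 1 and 2 the only representable values are 1.0 and 1.5; between 2 and 4 they are 2 and 3; between 4 and 8 they are 4 and 6.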
Like the OCP FP6 formats, and unlike the OCP FP8 formats, FP4 E2M1 does not encode infinity or NaN. All 16 bit patterns map to finite numbers. In a microscaling block it is paired with a shared block scale (an E8M0 power of two) that extends its effective range.
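To make the role of the shared scale concrete, here is a minimal sketch of quantizing one block to FP4 magnitudes with a power-of-two scale. It assumes the scale exponent is chosen as floor(log2(max |v|)) minus 2 (2 being the exponent of the largest FP4 value, 6 = 1.5 × 2^2) and that out-of-range values saturate to ±6. The function names, the tie-breaking rule, and the handling of an all-zero block are illustrative assumptions, not taken from any library.

```python
import math

# The 8 magnitudes FP4 E2M1 can represent.
FP4_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
EMAX_ELEM = 2  # exponent of the largest magnitude: 6 = 1.5 * 2**2


def quantize_block(values):
    """Return (shared_exp, elements): a power-of-two scale exponent and FP4 values."""
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        shared_exp = 0  # assumption: any scale works for an all-zero block
    else:
        # frexp gives amax = m * 2**e with 0.5 <= m < 1, so floor(log2(amax)) = e - 1.
        shared_exp = (math.frexp(amax)[1] - 1) - EMAX_ELEM
        # Keep the exponent inside the range an 8-bit E8M0 scale can hold.
        shared_exp = max(-127, min(127, shared_exp))
    scale = 2.0 ** shared_exp

    elements = []
    for v in values:
        scaled = min(abs(v) / scale, FP4_MAGNITUDES[-1])  # saturate at 6
        # Nearest magnitude; ties go to the smaller one in this sketch.
        # A real implementation would choose its rounding mode explicitly.
        nearest = min(FP4_MAGNITUDES, key=lambda m: abs(m - scaled))
        elements.append(math.copysign(nearest, v))
    return shared_exp, elements


def dequantize_block(shared_exp, elements):
    """Multiply each FP4 element by the shared power-of-two scale."""
    return [e * 2.0 ** shared_exp for e in elements]


shared_exp, elements = quantize_block([0.1, -0.75, 3.0, 12.0])
restored = dequantize_block(shared_exp, elements)
```

In this example the largest input, 12.0, maps to the top FP4 value 6 with a scale of 2, while 0.1 is too small for the block's scale and flushes to zero. That is the trade-off of sharing one scale across a whole block.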
The exponent bias is 1, calculated as 2^(e-1) - 1, where e is the number of exponent bits (2). A normal value with exponent field E and mantissa bit m therefore equals (1 + m/2) × 2^(E-1).
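The only nonzero subnormal magnitude is 0.5 (exponent field 00, mantissa bit 1): with no implicit leading 1, it decodes as 0.5 × 2^(1 - bias) = 0.5.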
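The bias and subnormal rules can be checked with a short decoder. This is a sketch that assumes the usual bit layout (sign in the most significant bit, then the 2 exponent bits, then the mantissa bit); the function name is hypothetical.

```python
def decode_fp4_e2m1(bits: int) -> float:
    """Decode a 4-bit pattern (0..15) laid out as sign | exponent(2) | mantissa(1)."""
    if not 0 <= bits <= 0xF:
        raise ValueError("expected a 4-bit pattern in the range 0..15")

    sign = -1.0 if bits & 0b1000 else 1.0
    exp_field = (bits >> 1) & 0b11
    mantissa = bits & 0b1
    bias = 1  # 2**(2 - 1) - 1

    if exp_field == 0:
        # Subnormal: no implicit leading 1, exponent fixed at 1 - bias.
        return sign * (mantissa * 0.5) * 2.0 ** (1 - bias)
    # Normal: implicit leading 1 plus the half-weight mantissa bit.
    return sign * (1.0 + mantissa * 0.5) * 2.0 ** (exp_field - bias)


positive_values = [decode_fp4_e2m1(b) for b in range(8)]
# positive_values == [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
```

Setting the top bit negates each value, so patterns 8 through 15 mirror patterns 0 through 7.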
FP4 E2M1 can represent the magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, and 6, each with either sign, for 16 encodings in total.
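Since there are only 16 possible bit patterns, here is the complete enumeration (bits in the order sign, exponent, mantissa):
0000 = +0      1000 = -0
0001 = +0.5    1001 = -0.5
0010 = +1      1010 = -1
0011 = +1.5    1011 = -1.5
0100 = +2      1100 = -2
0101 = +3      1101 = -3
0110 = +4      1110 = -4
0111 = +6      1111 = -6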
__nv_fp4_e2m1 with constructors and conversion operators. CUTLASS provides both mx_float4_t (OCP, 32-element blocks) and nv_float4_t (NVIDIA, 16-element blocks).
torch.float4_e2m1fn_x2 (2 packed values per byte) with E8M0 block scaling. The ONNX proto defines FLOAT4E2M1 with LSB-first packing.
OCP_MXFP4Spec for MXFP4 quantization on ROCm. The MLIR AMDGPU dialect supports f4E2M1FN in WMMA instructions on gfx12/RDNA4.
float4_e2m1fn as a NumPy custom dtype (possible magnitudes: 0, 0.5, 1, 1.5, 2, 3, 4, 6).
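Because two FP4 codes share one byte in the packed representations above, storage code needs a packing convention. The sketch below illustrates LSB-first packing, assuming that means the first element of each pair occupies the low nibble. The helper names are hypothetical, and padding an odd-length input with a zero code is an assumption of this sketch.

```python
def pack_fp4(codes):
    """Pack 4-bit codes two per byte, first element of each pair in the low nibble."""
    codes = list(codes)
    if any(not 0 <= c <= 0xF for c in codes):
        raise ValueError("every code must fit in 4 bits")
    if len(codes) % 2:
        codes.append(0)  # assumption: pad an odd-length input with code 0 (+0)
    return bytes(lo | (hi << 4) for lo, hi in zip(codes[0::2], codes[1::2]))


def unpack_fp4(data, count):
    """Recover the first `count` 4-bit codes from LSB-first packed bytes."""
    codes = []
    for byte in data:
        codes.append(byte & 0xF)  # low nibble holds the earlier element
        codes.append(byte >> 4)   # high nibble holds the later element
    return codes[:count]


packed = pack_fp4([0x1, 0x7, 0xF])  # codes for +0.5, +6, -6
unpacked = unpack_fp4(packed, 3)
```

The element count has to be tracked separately, since a padded final nibble is indistinguishable from a stored +0.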