Sub-byte unsigned integer: quantization packing and nibble-level data
A 4-bit unsigned integer (also called a nibble) uses all 4 bits for magnitude, giving a range of 0 to 15 (2^4 - 1). Two UINT4 values pack into a single byte, and each corresponds to one hexadecimal digit (0–F).
UINT4 is a sub-byte format primarily used as a packing container for quantized ML weights. While INT4 (signed) is more common in quantization schemes like GPTQ and AWQ, UINT4 appears in frameworks that use asymmetric quantization with a non-zero zero-point offset.
With only 16 possible values (0–15), UINT4 achieves the same 8× compression over FP32 as INT4 (4 bits per value instead of 32). In the packed layout, the first element of each pair occupies the least-significant nibble of the byte and the second element occupies the most-significant nibble.
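The packing rule can be sketched in a few lines of Python. This is a minimal illustration, not any framework's implementation: the helper names `pack_uint4` and `unpack_uint4` are hypothetical, and padding an odd-length input with a zero nibble is an assumption made here so the example is complete.

```python
def pack_uint4(values):
    """Pack ints in 0..15 into bytes, first element in the low nibble."""
    padded = list(values)
    for v in padded:
        if not 0 <= v <= 15:
            raise ValueError(f"{v} does not fit in 4 bits")
    if len(padded) % 2:
        padded.append(0)  # assumption: pad odd-length input with a zero nibble
    # Even-indexed elements go in the low nibble, odd-indexed in the high nibble.
    return bytes(lo | (hi << 4) for lo, hi in zip(padded[0::2], padded[1::2]))


def unpack_uint4(packed, count):
    """Recover `count` 4-bit values from packed bytes."""
    out = []
    for byte in packed:
        out.append(byte & 0x0F)  # first element: least-significant nibble
        out.append(byte >> 4)    # second element: most-significant nibble
    return out[:count]           # drop the padding nibble, if any


weights = [3, 15, 0, 9, 7]
packed = pack_uint4(weights)
print(packed.hex())                         # f39007
print(unpack_uint4(packed, len(weights)))   # [3, 15, 0, 9, 7]
```

Note that the element count has to be stored separately, because a packed buffer alone cannot distinguish a trailing padding nibble from a real zero.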
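In asymmetric quantization, the float-to-integer mapping includes a zero point: q = clamp(round(x / scale) + zero_point, 0, 15), where scale is the float step between adjacent integer levels. Dequantization approximately inverts it: x ≈ scale × (q - zero_point).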
The zero point shifts the mapping so that zero maps to a non-zero integer, allowing the full 0–15 range to cover an asymmetric distribution of float values.
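As a rough sketch of how such a mapping might be computed, the Python below derives a scale and zero point from a float range and applies the usual affine formula q = clamp(round(x / scale) + zero_point, 0, 15). The function names are hypothetical, and two choices are assumptions of this sketch rather than anything the text prescribes: the range is widened to include 0.0 so that zero is exactly representable, and rounding uses Python's built-in `round` (round-half-to-even).

```python
def quant_params(x_min, x_max, qmin=0, qmax=15):
    """Derive (scale, zero_point) for an unsigned 4-bit affine mapping."""
    # Assumption: widen the range to include 0.0 so zero is exactly representable.
    x_min = min(x_min, 0.0)
    x_max = max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        return 1.0, qmin  # degenerate all-zero range
    zero_point = round(qmin - x_min / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=0, qmax=15):
    """Map a float to an integer in [qmin, qmax]."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))  # clamp out-of-range inputs


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to an approximate float."""
    return scale * (q - zero_point)


# An asymmetric range: more headroom above zero than below.
scale, zero_point = quant_params(-1.0, 2.0)
print(zero_point)                          # 5
print(quantize(-1.0, scale, zero_point))   # 0
print(quantize(0.0, scale, zero_point))    # 5
print(quantize(2.0, scale, zero_point))    # 15
```

Here float 0.0 lands on integer 5 rather than 0, which is what lets the 16 levels span -1.0 to 2.0 without wasting codes on values that never occur.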
With only 4 bits, the complete set of values is small enough to list exhaustively: the 16 bit patterns run from 0000 (hex 0, decimal 0) to 1111 (hex F, decimal 15), and each nibble corresponds to exactly one hex digit.
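Because there are only 16 patterns, the full set can also be generated programmatically. The short Python sketch below (the helper name `uint4_table` is hypothetical) lists each value as a 4-bit binary string, a hex digit, and a decimal number.

```python
def uint4_table():
    """Return (binary, hex digit, decimal) for every 4-bit unsigned value."""
    return [(format(v, "04b"), format(v, "X"), v) for v in range(16)]


for bits, hex_digit, value in uint4_table():
    print(f"{bits}  0x{hex_digit}  {value:2d}")
```

The first row printed is `0000  0x0   0` and the last is `1111  0xF  15`.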
ONNX defines UINT4 in TensorProto.DataType with a packing rule of two 4-bit elements per byte (first element in the 4 least-significant bits). This is used for asymmetric quantization with unsigned zero-point offsets.
A NumPy extension type, uint4, provides a narrow integer (4-bit, stored in a byte, with defined cast behavior).
On NVIDIA GPUs, .u4 packed types feed unsigned 4-bit MMA instructions on Turing/Ampere.
The MLIR AMDGPU dialect supports unsigned i4 via udot8 instructions on gfx906+.