INT4 - 4-Bit Signed Integer

Sub-byte signed integer: extreme quantization for LLM inference and edge deployment

Bit Layout

A 4-bit integer (also called a nibble) stores one of 16 possible values. For INT4, two's complement gives the range -8 to 7. For the unsigned variant (0–15), see UINT4.
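The two's complement mapping described above can be sketched in a few lines. The helper names `int4_from_bits` and `int4_to_bits` are made up for this illustration and do not come from any library.

```python
def int4_from_bits(nibble: int) -> int:
    """Interpret the low 4 bits of `nibble` as a two's complement INT4."""
    nibble &= 0xF  # keep only the 4 payload bits
    # When the sign bit (weight 8) is set, the pattern stands for nibble - 16.
    return nibble - 16 if nibble & 0x8 else nibble


def int4_to_bits(value: int) -> int:
    """Encode an integer in [-8, 7] as a 4-bit two's complement pattern."""
    if not -8 <= value <= 7:
        raise ValueError("value out of INT4 range")
    return value & 0xF  # masking yields the two's complement bit pattern


print(int4_from_bits(0b0111))           # 7
print(int4_from_bits(0b1000))           # -8
print(int4_from_bits(0b1111))           # -1
print(format(int4_to_bits(-3), "04b"))  # 1101
```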

Overview

4-bit integers are a sub-byte format, meaning two INT4 values can be packed into a single byte. While not a standard hardware data type on most processors, INT4 has become critically important in machine learning for extreme quantization of large language models (LLMs).
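As a minimal sketch of the packing idea, the functions below store two INT4 values in one byte and recover them again. The nibble order (first value in the low nibble) is an assumption made for this example; real formats differ on which value goes first. The function names are hypothetical.

```python
def pack_int4_pair(low: int, high: int) -> int:
    """Pack two INT4 values into one byte (assumed order: `low` in bits 0-3)."""
    for value in (low, high):
        if not -8 <= value <= 7:
            raise ValueError("value out of INT4 range")
    return ((high & 0xF) << 4) | (low & 0xF)


def unpack_int4_pair(byte: int) -> tuple:
    """Recover the two signed INT4 values from a packed byte."""
    def sign_extend(nibble: int) -> int:
        # A set sign bit means the 4-bit pattern encodes nibble - 16.
        return nibble - 16 if nibble & 0x8 else nibble

    return sign_extend(byte & 0xF), sign_extend((byte >> 4) & 0xF)


packed = pack_int4_pair(-3, 5)
print(hex(packed))               # 0x5d
print(unpack_int4_pair(packed))  # (-3, 5)
```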

At 4 bits per value, INT4 weights take 8× less storage than FP32 and 2× less than INT8, before the small overhead of per-block scale factors. This makes it possible to run large models (7B, 13B, 70B parameters) that would otherwise require enterprise hardware on consumer GPUs and mobile devices.
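To make the arithmetic concrete, the sketch below estimates raw weight storage for the model sizes mentioned above. It counts weights only and ignores scale factors, activations, and runtime buffers, so treat the numbers as lower bounds. The function name `weight_storage_gb` is made up for this example, and GB here means 10^9 bytes.

```python
BITS_PER_WEIGHT = {"FP32": 32, "INT8": 8, "INT4": 4}


def weight_storage_gb(num_params: int, bits_per_weight: int) -> float:
    """Raw weight storage in decimal gigabytes (weights only, no overhead)."""
    return num_params * bits_per_weight / 8 / 1e9


for num_params in (7_000_000_000, 13_000_000_000, 70_000_000_000):
    sizes = ", ".join(
        f"{name}: {weight_storage_gb(num_params, bits):.1f} GB"
        for name, bits in BITS_PER_WEIGHT.items()
    )
    print(f"{num_params / 1e9:.0f}B parameters -> {sizes}")
```

Under these assumptions, a 7B-parameter model needs 28 GB for FP32 weights but 3.5 GB for INT4 weights.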

INT4 in LLM Quantization

Methods such as GPTQ and AWQ, and the Q4 formats used by GGML/GGUF, quantize LLM weights to 4 bits. The typical approach:

  1. Group weights into blocks (e.g., 32 or 128 values per block)
  2. Compute a per-block scale factor (stored in higher precision)
  3. Divide each weight by its block's scale and round to the nearest INT4 value
  4. At inference time, dequantize on the fly: weight = int4_value × scale
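The four steps can be sketched with plain round-to-nearest, symmetric quantization. This is a generic illustration only: it is not the GPTQ or AWQ algorithm (both choose quantized values more carefully) and not the exact GGUF block layout. The function names, the choice of mapping the largest magnitude to 7, and the sample weights are all assumptions made for this example.

```python
def quantize_block(weights):
    """Steps 2-3: compute one scale for the block, then round each weight to INT4."""
    max_abs = max(abs(w) for w in weights)
    # Assumed convention: the largest magnitude in the block maps to +/-7.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return scale, codes


def dequantize_block(scale, codes):
    """Step 4: weight = int4_value * scale."""
    return [code * scale for code in codes]


def quantize(weights, block_size=32):
    """Step 1: split the weights into blocks and quantize each independently."""
    return [
        quantize_block(weights[start:start + block_size])
        for start in range(0, len(weights), block_size)
    ]


weights = [0.70, -0.32, 0.11, -0.04, 0.00, 0.21, -0.70, 0.49]
scale, codes = quantize_block(weights)
restored = dequantize_block(scale, codes)
print("scale:", scale)
print("codes:", codes)
print("max abs error:", max(abs(w - r) for w, r in zip(weights, restored)))
```

With this scheme the reconstruction error per weight is at most half a scale step. If each 32-weight block stores a 16-bit scale, the effective cost is 4 + 16/32 = 4.5 bits per weight rather than exactly 4.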

A nibble = half a byte

A 4-bit value is traditionally called a "nibble" (or "nybble"). One byte contains exactly two nibbles. Each nibble corresponds to one hexadecimal digit (0–F), which is why hex is so convenient for displaying binary data.
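A short sketch of the nibble-to-hex correspondence: splitting a byte into its two nibbles yields exactly its two hexadecimal digits. The helper name `split_nibbles` is hypothetical.

```python
def split_nibbles(byte: int) -> tuple:
    """Return (high_nibble, low_nibble) of an unsigned byte."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")
    return byte >> 4, byte & 0xF


high, low = split_nibbles(0xA7)
print(format(0xA7, "08b"))                   # 10100111
print(format(high, "04b"), format(high, "X"))  # 1010 A
print(format(low, "04b"), format(low, "X"))    # 0111 7
```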

Range & Properties

All Representable Values

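With only 16 possible bit patterns, here is the complete set:

Bits    Value
0000     0
0001     1
0010     2
0011     3
0100     4
0101     5
0110     6
0111     7
1000    -8
1001    -7
1010    -6
1011    -5
1100    -4
1101    -3
1110    -2
1111    -1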

Format Comparison

Where INT4 Is Used