BF16 - Brain Float 16

Google Brain's 16-bit format: same range as FP32, designed for deep learning

Bit Layout

A BF16 number uses 16 bits divided into three fields, from the most significant bit down: 1 sign bit, 8 exponent bits, and 7 mantissa bits.

BF16 uses the same 8-bit exponent as FP32, giving it essentially the same dynamic range. The trade-off is a smaller 7-bit mantissa, which provides less precision than FP16's 10 mantissa bits.
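
The field layout can be made concrete with a few shifts and masks. The following is a minimal sketch rather than a reference implementation; the function name `split_bf16` is a hypothetical helper introduced here for illustration.

```python
def split_bf16(bits: int) -> tuple[int, int, int]:
    """Split a 16-bit BF16 pattern into (sign, exponent, mantissa) fields."""
    if not 0 <= bits <= 0xFFFF:
        raise ValueError("expected a 16-bit unsigned integer")
    sign = (bits >> 15) & 0x1      # bit 15
    exponent = (bits >> 7) & 0xFF  # bits 14..7 (8 bits)
    mantissa = bits & 0x7F         # bits 6..0 (7 bits)
    return sign, exponent, mantissa


print(split_bf16(0x3F80))  # (0, 127, 0): the pattern for 1.0
print(split_bf16(0xC040))  # (1, 128, 64): the pattern for -3.0
```

The exponent is returned in its raw, biased form, so 127 here corresponds to an actual exponent of 0.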

Overview

BF16 (Brain Floating Point 16) was developed by Google Brain for use in its TPU (Tensor Processing Unit) hardware. It has since been adopted by virtually every major ML hardware vendor, including NVIDIA (Ampere and later), AMD, Intel, and ARM.

The key insight behind BF16 is that deep learning training is more sensitive to dynamic range than to precision. Neural network weights, activations, and gradients can span many orders of magnitude, and FP16's narrow range (maximum finite value 65,504) often causes overflow. BF16 avoids this by keeping FP32's 8-bit exponent while shortening the mantissa.
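
To put numbers on the range difference, the largest finite value of each format can be computed from its exponent and mantissa widths. This sketch assumes the usual IEEE-style conventions (a bias of 2^(e-1) - 1 and an all-ones exponent reserved for special values); `max_finite` is a hypothetical helper name.

```python
def max_finite(exponent_bits: int, mantissa_bits: int) -> float:
    """Largest finite value of an IEEE-style binary float with the given widths."""
    bias = 2 ** (exponent_bits - 1) - 1
    # The all-ones exponent field is reserved, so the largest usable field
    # value is 2**exponent_bits - 2.
    max_exponent = (2 ** exponent_bits - 2) - bias
    # Largest significand: 1.111...1 in binary, i.e. 2 - 2**-mantissa_bits.
    return (2 - 2.0 ** -mantissa_bits) * 2.0 ** max_exponent


fp16_max = max_finite(5, 10)
bf16_max = max_finite(8, 7)
fp32_max = max_finite(8, 23)

print(f"FP16 max: {fp16_max:.6g}")
print(f"BF16 max: {bf16_max:.6g}")
print(f"FP32 max: {fp32_max:.6g}")
```

A value such as 70,000 already exceeds the FP16 maximum, while the BF16 maximum is on the order of 10^38, just below FP32's.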

Converting between FP32 and BF16 is simple: in its most basic form, FP32 → BF16 just drops the lower 16 bits of the FP32 representation. This makes BF16 very hardware-friendly.

Think of BF16 as "truncated FP32": a BF16 value is literally the upper 16 bits of an FP32 number. You can convert FP32 → BF16 by chopping off the lower 16 bits, and BF16 → FP32 by padding with 16 zero bits.
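
The truncate-and-pad conversion can be sketched with Python's standard `struct` module, which reinterprets a float's bytes as an integer. The function names below are hypothetical, and the sketch implements plain truncation exactly as described above; it does not attempt any rounding.

```python
import struct


def fp32_to_bf16_bits(x: float) -> int:
    """Return the BF16 bit pattern obtained by truncating x's FP32 encoding."""
    # Pack as a 32-bit float, then reinterpret those bytes as a 32-bit integer.
    (fp32_bits,) = struct.unpack("<I", struct.pack("<f", x))
    return fp32_bits >> 16  # keep the upper 16 bits


def bf16_bits_to_fp32(bits: int) -> float:
    """Widen a BF16 bit pattern to FP32 by padding with 16 zero bits."""
    (value,) = struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))
    return value


bits = fp32_to_bf16_bits(3.14159)
print(hex(bits))                # 0x4049
print(bf16_bits_to_fp32(bits))  # 3.140625
```

Truncation always rounds toward zero, which is why 3.14159 comes back as 3.140625. Many real implementations round to nearest even instead, at the cost of a little extra logic.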

Encoding Rules

Normal Numbers

value = (-1)^{sign} × 2^{exponent - 127} × (1 + mantissa / 2^{7})

BF16 follows the same rules as FP32, but with only 7 mantissa bits instead of 23. The bias of 127 is calculated as 2^{e-1} - 1, where e is the number of exponent bits (8), identical to FP32.
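
As a worked instance of the normal-number formula, the sketch below decodes a bit pattern field by field. The name `decode_normal_bf16` is hypothetical, and the function deliberately rejects exponent fields of all zeros or all ones, which are not normal numbers.

```python
def decode_normal_bf16(bits: int) -> float:
    """Decode a normal BF16 bit pattern using the formula above."""
    sign = (bits >> 15) & 0x1
    exponent = (bits >> 7) & 0xFF
    mantissa = bits & 0x7F
    if exponent in (0, 0xFF):
        raise ValueError("exponent field of all zeros or all ones is not a normal number")
    bias = 2 ** (8 - 1) - 1  # 127 for 8 exponent bits
    return (-1) ** sign * 2.0 ** (exponent - bias) * (1 + mantissa / 2 ** 7)


print(decode_normal_bf16(0x3FC0))  # 1.5  = +1 × 2^0 × (1 + 64/128)
print(decode_normal_bf16(0xC040))  # -3.0 = -1 × 2^1 × (1 + 64/128)
```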

Subnormal Numbers

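When the exponent field is all zeros, the implicit leading 1 is dropped and the exponent is fixed at -126:

value = (-1)^{sign} × 2^{-126} × (0 + mantissa / 2^{7})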
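
The subnormal formula can be checked the same way. This is a small sketch under the same assumptions as the formula; `decode_subnormal_bf16` is a hypothetical name, and the function only accepts patterns whose exponent field is zero.

```python
def decode_subnormal_bf16(bits: int) -> float:
    """Decode a BF16 bit pattern whose exponent field is all zeros."""
    sign = (bits >> 15) & 0x1
    exponent = (bits >> 7) & 0xFF
    mantissa = bits & 0x7F
    if exponent != 0:
        raise ValueError("subnormals require an all-zero exponent field")
    # No implicit leading 1; the exponent is fixed at -126.
    return (-1) ** sign * 2.0 ** -126 * (0 + mantissa / 2 ** 7)


smallest = decode_subnormal_bf16(0x0001)  # 2^-133
largest = decode_subnormal_bf16(0x007F)   # (127/128) × 2^-126
print(smallest, largest)
```

With a mantissa of zero the same formula yields zero, so the two zero patterns (0x0000 and 0x8000) fall out of the subnormal rule without special handling.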

Special Values

Dynamic Range & Precision

Special Values & Bit Patterns

Format Comparison

Where BF16 Is Used