I Use This When...
I want a generative model for high-quality images, audio, or multimodal outputs, and I care about training stability and sample quality. Diffusion became the default generative image family behind many modern image systems.
History
Ho, Jain, Abbeel (2020) — DDPM. Built on Sohl-Dickstein et al. (2015). Powers Stable Diffusion, DALL-E 2, Midjourney, Imagen.
Why It Exists
The "why" chain is:
- GANs can generate sharp samples but are unstable.
- We want a generative process with a smoother training signal.
- Gradually adding noise is easy to define.
- If we can learn to reverse that process, generation becomes denoising.
Diffusion models exist because reversing a simple noising process turned out to be much more stable than adversarial generation.
How It Works
Visual Intuition
Imagine taking a real image and adding a tiny bit of Gaussian noise over and over until it becomes pure static.
- the forward process destroys structure step by step
- the model learns the reverse process
- starting from noise, repeated denoising reconstructs a coherent image
So generation becomes "start from noise and clean it up."
The timeline node is here:
Step by Step
- Define a forward process that adds Gaussian noise over many timesteps
- Train a neural network to predict the noise or denoised sample at each step
- At inference time, start from random noise
- Repeatedly apply the learned reverse step
- End with a final generated sample
The model is not trying to jump from noise to image in one leap. It learns many small denoising moves.
Code
# concept sketch
# x_t = add_noise(x_0, t)
# pred_noise = model(x_t, t)
# loss = mse(pred_noise, true_noise)
The Math Inside
Forward process:
- start with data sample
x_0 - sample
x_tby gradually adding Gaussian noise
Reverse process:
p_theta(x_{t-1} | x_t)
The model learns a reverse transition that makes the sample slightly less noisy at each step.
A common training view is noise prediction:
- sample a timestep
t - corrupt a data sample with known Gaussian noise
- train the model to predict that noise
That objective is simple, differentiable, and stable, which is a major reason diffusion models became so successful.
Math Prerequisites
- Probability Distributions - Gaussian noise process
- Loss Functions - noise-prediction objective
- GAN - the adversarial alternative diffusion displaced in many image tasks
Related
- GAN — Adversarial alternative
- Autoencoder — Variational approach
- Distributions — Gaussian noise process