Deep Q-Network (DQN)

I Use This When...

History

Mnih et al. (DeepMind, 2013/2015). Played Atari games from raw pixels. Published in Nature. DeepMind acquired by Google for $500M shortly after.

Why It Exists

Q-table doesn't scale — can't store Q-values for every possible state (e.g., every possible screen pixel combination). Replace the table with a neural network.

How It Works

Visual Intuition

Step by Step

Code

# Implementation sketch

The Math Inside

Neural net approximates Q: Q(s,a;θ) ≈ Q*(s,a). Key tricks: experience replay (store transitions, sample randomly) + target network (stabilize training by updating slowly).

Math Prerequisites

Q-Learning — The tabular foundation
Policy Gradient / PPO — Policy-based alternative
MLP & Backprop — The neural network inside