I Use This When...
<!-- Practical use case -->
History
One of the earliest clustering methods (1960s). Produces a tree (dendrogram) of nested clusters rather than a flat partition.
Why It Exists
k-Means requires you to choose k upfront. Hierarchical clustering builds a tree instead: cut it at any level to get any number of clusters, from 1 to n, without rerunning the algorithm.
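To make the "cut it at any level" idea concrete, here is a minimal sketch using SciPy's `linkage` and `fcluster`. The toy data (three blobs), the random seed, and the choice of Ward's linkage are assumptions made for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data: three well-separated 2D blobs (made up for illustration).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * rng.standard_normal((20, 2)) for c in centers])

# Build the merge tree once; Z has one row per merge (n - 1 rows).
Z = linkage(X, method="ward")

# Cut the same tree at several levels; nothing is refitted.
labels_by_k = {k: fcluster(Z, t=k, criterion="maxclust") for k in (2, 3, 5)}
for k, labels in labels_by_k.items():
    print(f"requested k={k}, got {len(np.unique(labels))} clusters")
```

The tree is computed once and each value of k is just a different cut, which is the practical difference from rerunning k-Means for every candidate k.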
How It Works
Visual Intuition
<!-- 3B1B-style animation description -->
Step by Step
<!-- Algorithm walkthrough -->
Code
# Implementation sketch
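One possible from-scratch sketch of the bottom-up procedure is below. It is a naive version for reading, not for speed: the function name `agglomerative`, its parameters, and the use of Euclidean distance are all assumptions, and Ward's linkage is left out to keep it short.

```python
import numpy as np

def agglomerative(X, n_clusters, linkage="single"):
    """Naive agglomerative clustering (hypothetical helper).

    Returns a list of clusters, each a list of row indices into X.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all points.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # How point-to-point distances are reduced to a cluster-to-cluster one.
    reduce_fn = {"single": np.min, "complete": np.max, "average": np.mean}[linkage]

    # Start with every point in its own cluster.
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > n_clusters:
        best = (np.inf, -1, -1)
        # Find the pair of clusters with the smallest linkage distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = reduce_fn(D[np.ix_(clusters[a], clusters[b])])
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge cluster b into cluster a (b > a, so index a stays valid).
        clusters[a] = clusters[a] + clusters.pop(b)
    return clusters

# Usage on five hand-picked points.
points = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0]]
result = agglomerative(points, n_clusters=3, linkage="single")
print(result)
```

This version stops at a requested number of clusters rather than recording the full tree; storing each merge and its distance would give the dendrogram.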
The Math Inside
Agglomerative (bottom-up): start with each point as its own cluster, then repeatedly merge the closest pair of clusters until only one remains. Linkage is the rule that defines "closest" between two clusters: single, complete, average, or Ward's.
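The four linkage rules named above are commonly written as follows. The notation is an assumption for this sketch: A and B are clusters, distances are Euclidean, and \mu_A, \mu_B are the cluster centroids. The Ward expression is the increase in within-cluster sum of squares caused by merging A and B.

```latex
\begin{align*}
d_{\text{single}}(A, B)   &= \min_{a \in A,\, b \in B} \lVert a - b \rVert \\
d_{\text{complete}}(A, B) &= \max_{a \in A,\, b \in B} \lVert a - b \rVert \\
d_{\text{average}}(A, B)  &= \frac{1}{|A|\,|B|} \sum_{a \in A} \sum_{b \in B} \lVert a - b \rVert \\
d_{\text{Ward}}(A, B)     &= \frac{|A|\,|B|}{|A| + |B|} \, \lVert \mu_A - \mu_B \rVert^2
\end{align*}
```

At each step the algorithm merges the pair (A, B) that minimizes the chosen d.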