I Use This When...
I want a simple local classifier with almost no training phase, especially on small tabular or low-dimensional problems where neighborhoods are meaningful. It is also one of the clearest ways to visualize what classification means.
History
Cover & Hart (1967). One of the oldest and simplest ML algorithms. No training phase — just memorize the data.
Why It Exists
The "why" chain is:
- maybe global decision boundaries are too rigid
- maybe local neighborhoods already tell us the label
- if similar points tend to share labels, nearby examples can vote
- so instead of fitting parameters, we store data and reason locally
k-NN exists because sometimes classification can be reduced to "who lives near this point?"
How It Works
Visual Intuition
Imagine a scatter plot with colored points.
- a new point appears
- we draw a circle around its nearest neighbors
- those neighbors cast votes
- the local majority decides the class
As k changes, the boundary changes too:
- small
kmakes the boundary very flexible - larger
ksmooths the boundary
The timeline node is here:
Step by Step
- Store the labeled training points
- Choose a distance metric such as Euclidean distance
- For a new point, compute distance to all training points
- Select the
kclosest neighbors - Predict by majority vote or distance-weighted vote
Code
def knn_predict(neighbors):
votes = {}
for label in neighbors:
votes[label] = votes.get(label, 0) + 1
return max(votes, key=votes.get)
The Math Inside
There is no parametric model to fit.
Prediction rule:
y_hat = majority_vote(labels of k nearest neighbors)
A common distance is Euclidean:
d(x, z) = sqrt(sum (x_i - z_i)^2)
Important tradeoff:
- small
k-> low bias, high variance - large
k-> higher bias, lower variance
Why k-NN struggles in high dimension:
- distances become less informative
- everything starts to look far from everything else
So k-NN works best when neighborhoods are genuinely meaningful.
Math Prerequisites
- Vectors & Matrices - points as vectors
- Dot Product - geometry and similarity intuition
- Bias-Variance Tradeoff - how
kchanges smoothness
Related
- Decision Tree — Another interpretable classifier
- Curse of Dimensionality — Why k-NN fails in high dimensions
- Vectors & Distance — Distance functions