Math for ML — The Complete Map

Why Math?

Most ML algorithms come down to optimizing an objective function. Understanding the math lets you:

Debug models (why is the loss not decreasing?)
Choose the right algorithm (what assumptions does it make?)
Read papers (the field communicates in math)
Invent new approaches (you can't improve what you don't understand)

The Four Pillars

Probability & Statistics     "How certain are we?"
  → Bayes, distributions, MLE
  → Used in: Naive Bayes, GMM, Bayesian methods

Linear Algebra               "How do we represent and transform data?"
  → Vectors, matrices, eigenvalues
  → Used in: PCA, neural networks, SVD

Calculus & Optimization      "How do we find the best parameters?"
  → Gradients, chain rule, gradient descent
  → Used in: any model trained by gradient-based optimization

Information Theory           "How do we measure uncertainty?"
  → Entropy, KL divergence, cross-entropy
  → Used in: decision trees, loss functions, VAEs

Topics

Probability & Statistics

Bayes' Theorem — Updating beliefs with evidence
Probability Distributions — Gaussian, Bernoulli, Poisson, etc.
MLE & MAP — Finding the most likely parameters
Conditional Independence — The 'naive' in Naive Bayes
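
To make "updating beliefs with evidence" concrete, here is a minimal sketch of Bayes' theorem for a yes/no hypothesis. The function name `posterior` and the test numbers (1% base rate, 90% sensitivity, 5% false-positive rate) are illustrative assumptions, not values from these notes.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) for a binary hypothesis H given evidence E."""
    # Total probability of the evidence:
    # P(E) = P(E | H) P(H) + P(E | not H) P(not H)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    # Bayes' theorem: P(H | E) = P(E | H) P(H) / P(E)
    return likelihood * prior / evidence


# Hypothetical numbers: rare condition, fairly accurate test.
p = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(f"P(condition | positive test) = {p:.3f}")
```

With these made-up numbers the posterior is only about 15%, because the low prior dominates: most positive results come from the much larger group without the condition.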

Linear Algebra

Vectors & Matrices — The language of data
Eigenvalues & Eigenvectors — Directions a matrix stretches but doesn't rotate
Dot Product & Projection — Similarity and shadows
Matrix Decomposition (SVD) — Breaking matrices apart
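
The "similarity and shadows" idea can be sketched in a few lines of plain Python. The helper names (`dot`, `cosine_similarity`, `project`, `matvec`) and the example vectors are assumptions chosen for illustration; real code would normally use a library such as NumPy.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product normalized by both lengths: 1 = same direction, 0 = orthogonal."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def project(a, b):
    """Projection of a onto b: the 'shadow' that a casts along b."""
    scale = dot(a, b) / dot(b, b)
    return [scale * x for x in b]


def matvec(m, v):
    """Matrix-vector product, with the matrix given as a list of rows."""
    return [dot(row, v) for row in m]


a, b = [3.0, 4.0], [1.0, 0.0]
print(dot(a, b))                # 3.0
print(cosine_similarity(a, b))  # 0.6
print(project(a, b))            # [3.0, 0.0]

# Eigenvector check: A v should equal lambda * v (here lambda = 3).
A = [[2.0, 0.0], [0.0, 3.0]]
v = [0.0, 1.0]
print(matvec(A, v))             # [0.0, 3.0]
```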

Calculus & Optimization

Derivatives & Gradients — Slope in multiple dimensions
Chain Rule — Why backpropagation works
Partial Derivatives — Changing one variable at a time
Taylor Approximation — Local approximations (used in XGBoost)
Gradient Descent — Walking downhill
Convex Optimization — When every local minimum is a global minimum
Lagrange Multipliers — Optimization with constraints (SVM)
Constrained Optimization — The general framework
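
As a rough illustration of "walking downhill", the sketch below runs plain gradient descent on a one-dimensional convex function. The function f(x) = (x - 3)^2, the learning rate, and the step count are all assumptions picked so the example converges; they are not taken from these notes.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


def f(x):
    """Toy convex objective with its minimum at x = 3."""
    return (x - 3.0) ** 2


def grad_f(x):
    """Analytic derivative of f."""
    return 2.0 * (x - 3.0)


def numeric_grad(func, x, h=1e-5):
    """Central-difference estimate of the derivative, useful as a sanity check."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


x_min = gradient_descent(grad_f, x0=0.0)
print(f"x_min = {x_min:.6f}, f(x_min) = {f(x_min):.2e}")
print(f"analytic vs numeric gradient at x=1: {grad_f(1.0)} vs {numeric_grad(f, 1.0):.6f}")
```

Because f is convex, any sufficiently small learning rate reaches the same minimum here. On non-convex losses the result depends on the starting point and step size.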

Information Theory

Entropy — Measuring surprise
KL Divergence — How much one distribution differs from another (asymmetric, so not a true distance)
Cross-Entropy — The most common classification loss
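
The three quantities above are tightly linked: cross-entropy equals entropy plus KL divergence, H(p, q) = H(p) + KL(p || q). The sketch below checks this numerically in bits. The function names and the two example distributions are illustrative assumptions, and the code assumes q is nonzero wherever p is nonzero.

```python
import math


def entropy(p):
    """H(p): average surprise, in bits, of outcomes drawn from p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """H(p, q): average surprise when outcomes follow p but we predict q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL(p || q): extra bits paid for using q in place of p."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical distributions: a fair coin and a heavily biased prediction.
p = [0.5, 0.5]
q = [0.9, 0.1]

print(f"H(p)       = {entropy(p):.4f}")
print(f"H(p, q)    = {cross_entropy(p, q):.4f}")
print(f"KL(p || q) = {kl_divergence(p, q):.4f}")
print(f"KL(q || p) = {kl_divergence(q, p):.4f}")
```

The last two lines print different values, which shows that KL divergence is asymmetric. Minimizing cross-entropy with respect to q is equivalent to minimizing KL(p || q), since H(p) does not depend on q.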
