I Use This When...
I need a mathematical description of uncertainty, noise, or repeated outcomes. In this bundle, the two key distributions are Bernoulli for yes/no labels and Gaussian for continuous values around an average.
Why It Exists
The "why" chain is:
- Raw data looks messy.
- We still want a compact description of how outcomes behave.
- A distribution says which outcomes are likely and which are rare.
- Once we have that, we can estimate parameters, compute likelihoods, and train models.
Distributions exist because ML needs a language for randomness, not just fixed numbers.
Visual Intuition
Think of two pictures:
- Bernoulli: a biased coin with only two outcomes,
0or1 - Gaussian: a bell curve centered around a typical value, with fewer points far from the center
Those two shapes cover a surprising amount of ML:
- logistic regression treats labels as Bernoulli outcomes
- Gaussian noise shows up in regression, sensor error, and many generative models
The Math Inside
Bernoulli
If X can only be 0 or 1, and P(X = 1) = p, then
P(X = x) = p^x (1 - p)^(1 - x)
for x in {0, 1}.
- mean:
p - variance:
p(1 - p)
This is the right model when the target is binary.
Gaussian
If a variable clusters around a mean mu with spread sigma^2, the density is
p(x) = (1 / sqrt(2 pi sigma^2)) * exp(- (x - mu)^2 / (2 sigma^2))
mu: centersigma^2: variance
The Gaussian matters because many natural measurement errors and aggregated effects look approximately bell-shaped.
Examples
- Spam or not spam: Bernoulli
- Click or no click: Bernoulli
- Height, temperature, sensor noise: often approximated by Gaussian
- Gaussian Naive Bayes: assumes each feature is Gaussian within each class
Code
import math
def bernoulli_pmf(x, p):
return (p ** x) * ((1 - p) ** (1 - x))
def gaussian_pdf(x, mu, sigma):
coef = 1.0 / math.sqrt(2 * math.pi * sigma * sigma)
exp_term = math.exp(-((x - mu) ** 2) / (2 * sigma * sigma))
return coef * exp_term
Used In
- Logistic Regression — Bernoulli likelihood for binary labels
- GMM — Mixture of Gaussians
- Naive Bayes — Gaussian NB
- Diffusion Models — Gaussian noise process
- t-SNE — Student's t-distribution