Gaussian Mixture Models
Blend multiple Gaussians to model complex data; fit with EM or variational tricks.
"Each point whispers, 'I belong,' mixtures hum a Gaussian song."
Mixture model basics
Assume each observation comes from one of $K$ components. For GMMs: $$p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \,\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$$ with mixing weights $\pi_k \ge 0$, $\sum_k \pi_k = 1$, means $\boldsymbol{\mu}_k$, and covariances $\boldsymbol{\Sigma}_k$ (a small evaluation sketch follows the list below).
- Latent component assignment per point.
- Useful for clustering, density estimation, anomaly detection.
- Estimate parameters by maximum likelihood (typically via EM) or by variational inference.
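For concreteness, here is a minimal sketch of evaluating the mixture density at a point with SciPy; the two components, weights, means, and covariances are made-up illustrative values, not fitted parameters.
# Evaluate p(x) for a 2-component, 2-D Gaussian mixture (illustrative parameters)
import numpy as np
from scipy.stats import multivariate_normal
weights = np.array([0.6, 0.4])                        # pi_k, must sum to 1
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]  # mu_k
covs = [np.eye(2), 0.5 * np.eye(2)]                   # Sigma_k
x = np.array([1.0, 1.0])
p_x = sum(w * multivariate_normal.pdf(x, mean=m, cov=c)
          for w, m, c in zip(weights, means, covs))
print(p_x)  # mixture density at x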
Maximum likelihood for GMMs
Dataset $\{\mathbf{x}_1,\dots,\mathbf{x}_n\}$, parameters $\boldsymbol{\theta}=\{\pi_k,\boldsymbol{\mu}_k,\boldsymbol{\Sigma}_k\}_{k=1}^K$.
Log-likelihood: $$\ell(\boldsymbol{\theta})=\sum_{i=1}^n \log \sum_{k=1}^K \pi_k \mathcal{N}(\mathbf{x}_i \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$$
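Evaluating $\ell(\boldsymbol{\theta})$ naively can underflow when component densities are tiny, so it is common to work in log space with logsumexp. A minimal sketch, assuming parameters are given as lists of per-component values:
# Numerically stable GMM log-likelihood via logsumexp
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal
def gmm_log_likelihood(X, weights, means, covs):
    # log_probs[i, k] = log pi_k + log N(x_i | mu_k, Sigma_k)
    log_probs = np.stack([np.log(w) + multivariate_normal.logpdf(X, mean=m, cov=c)
                          for w, m, c in zip(weights, means, covs)], axis=1)
    return logsumexp(log_probs, axis=1).sum()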
EM steps
E-step responsibilities: $$\gamma_{ik} = \frac{\pi_k \mathcal{N}(\mathbf{x}_i \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}{\sum_{j=1}^K \pi_j \mathcal{N}(\mathbf{x}_i \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)}$$ M-step updates: $$\pi_k = \frac{1}{n}\sum_{i=1}^n \gamma_{ik},\quad \boldsymbol{\mu}_k = \frac{\sum_{i=1}^n \gamma_{ik}\mathbf{x}_i}{\sum_{i=1}^n \gamma_{ik}},$$ $$\boldsymbol{\Sigma}_k = \frac{\sum_{i=1}^n \gamma_{ik}(\mathbf{x}_i - \boldsymbol{\mu}_k)(\mathbf{x}_i - \boldsymbol{\mu}_k)^\top}{\sum_{i=1}^n \gamma_{ik}}$$
EM climbs the log-likelihood and converges to a local maximum.
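A minimal NumPy sketch of a single EM iteration implementing the updates above; it assumes full covariances and omits the small covariance regularization (e.g., adding a ridge to each $\boldsymbol{\Sigma}_k$) that practical implementations use.
# One EM iteration for a GMM (illustrative, no regularization)
import numpy as np
from scipy.stats import multivariate_normal
def em_step(X, weights, means, covs):
    n, K = X.shape[0], len(weights)
    # E-step: responsibilities gamma[i, k]
    dens = np.stack([w * multivariate_normal.pdf(X, mean=m, cov=c)
                     for w, m, c in zip(weights, means, covs)], axis=1)
    gamma = dens / dens.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted updates
    Nk = gamma.sum(axis=0)                       # effective count per component
    new_weights = Nk / n
    new_means = [gamma[:, k] @ X / Nk[k] for k in range(K)]
    new_covs = []
    for k in range(K):
        diff = X - new_means[k]
        new_covs.append((gamma[:, k, None] * diff).T @ diff / Nk[k])
    return new_weights, new_means, new_covs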
Variational inference
Approximate the posterior over latent assignments with a tractable family; maximize the evidence lower bound (ELBO), which is equivalent to minimizing the KL divergence to the true posterior, via variational EM (a scikit-learn sketch follows the list below).
- E-step: update variational distribution to tighten ELBO.
- M-step: update $\pi_k, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k$ using variational stats.
- Faster than exact posterior inference; may introduce bias but scales well.
- Extensions: black-box VI, amortized VI for flexible inference networks.
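scikit-learn provides a variational treatment of GMMs as BayesianGaussianMixture; a minimal sketch is below, where the prior strength weight_concentration_prior=0.01 is an illustrative choice that encourages unused components to receive near-zero weight.
# Variational GMM: over-specify K and let the prior prune extra components
from sklearn.mixture import BayesianGaussianMixture
bgmm = BayesianGaussianMixture(n_components=10, covariance_type='full',
                               weight_concentration_prior=0.01,
                               max_iter=500, random_state=0)
bgmm.fit(X)                      # X: data matrix, e.g. from the example below
print(bgmm.weights_.round(3))    # several weights should be close to zero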
Example: fit a 3-component GMM
Generate 2D clusters and fit a 3-component full-covariance GMM with scikit-learn.
# Python example
import numpy as np
from sklearn.mixture import GaussianMixture
import matplotlib.pyplot as plt

np.random.seed(0)
n_samples = 500  # points per cluster (1500 total)

# Three isotropic 2-D clusters centered at (1, 1), (-2, 1), and (0, -2)
X = np.concatenate((
    np.random.randn(n_samples, 2) * 0.5 + np.array([1, 1]),
    np.random.randn(n_samples, 2) * 0.5 + np.array([-2, 1]),
    np.random.randn(n_samples, 2) * 0.5 + np.array([0, -2]),
))

# Fit a 3-component GMM with full (unrestricted) covariance matrices
gmm = GaussianMixture(n_components=3, covariance_type='full', random_state=0)
gmm.fit(X)

# Hard-assign each point to its most probable component and plot
labels = gmm.predict(X)
plt.scatter(X[:, 0], X[:, 1], c=labels, s=40, cmap='viridis')
plt.show()
Try switching covariance_type to 'diag' or 'tied' and watch cluster shapes change.
🎵 Memory jingle
"Pick a $k$, stir Gaussians round,
Weights that sum, covariances sound.
E then M, responsibilities sway,
Mixtures hum their latent way."
🧠 Quick Quiz
In the E-step of EM for a GMM, what is computed?
Mini Lab: tweak a GMM
Practical tweaks to try: vary n_components, switch covariance_type ('full', 'diag', 'tied', 'spherical'), set random_state for reproducible initializations, and compare candidate models with BIC or AIC.
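One concrete tweak, sketched below: compare component counts and covariance structures by BIC (lower is better), reusing X from the fitting example above.
# Model selection over n_components and covariance_type via BIC
from sklearn.mixture import GaussianMixture
best = None
for k in range(1, 7):
    for cov_type in ('full', 'diag', 'tied', 'spherical'):
        model = GaussianMixture(n_components=k, covariance_type=cov_type,
                                random_state=0).fit(X)
        bic = model.bic(X)
        if best is None or bic < best[0]:
            best = (bic, k, cov_type)
print('Best (BIC, k, covariance_type):', best)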
Wrap-up
"Mix the bells, fit the song,
E then M till peaks are strong.
Variational if you must,
Gaussians blend with gentle trust."