Bayesian Inference
Updating beliefs with data: Priors, Posteriors, and MAP Estimation.
1 Introduction
In Frequentist Inference, we assume the unknown parameter \( \theta \) is a fixed constant and estimate it from data (e.g., with the sample mean).
In Bayesian Inference, we treat the unknown parameter as a random variable \( \Theta \) (uppercase for the random variable, lowercase \( \theta \) for a particular value it may take).
- We start with a Prior Distribution \( f_\Theta(\theta) \) representing our initial belief.
- We observe data \( D \).
- We update our belief to get the Posterior Distribution \( f_{\Theta|D}(\theta|D) \) using Bayes' Rule.
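To make the update concrete, here is a minimal sketch with just two candidate values of \( \theta \); the hypotheses and numbers are illustrative, not from the text:

```python
# A minimal sketch of one Bayesian update with two competing hypotheses.
# Hypotheses: the coin is fair (theta = 0.5) or biased (theta = 0.8).
prior = {0.5: 0.5, 0.8: 0.5}            # f(theta): initial belief

# Observe D = "heads". The likelihood P(D | theta) is just theta itself.
likelihood = {theta: theta for theta in prior}

# Bayes' rule: posterior is proportional to likelihood times prior.
evidence = sum(likelihood[t] * prior[t] for t in prior)   # P(D)
posterior = {t: likelihood[t] * prior[t] / evidence for t in prior}

print(posterior)   # {0.5: ~0.38, 0.8: ~0.62}: belief shifts toward the biased coin
```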
2 Bayes' Theorem for Inference
\[ f_{\Theta|D}(\theta|D) = \frac{P(D|\theta) f_\Theta(\theta)}{P(D)} \]
Prior \( f_\Theta(\theta) \)
What we think about \( \theta \) before seeing data.
Likelihood \( P(D|\theta) \)
How probable the data is for a given \( \theta \).
Posterior \( f_{\Theta|D}(\theta|D) \)
Our updated belief about \( \theta \) after seeing data.
Evidence \( P(D) \)
Normalizing constant: the total probability of the data, averaged over the prior.
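For a continuous parameter, the evidence is obtained by marginalizing the numerator over all values of \( \theta \), which is what makes the posterior integrate to 1:
\[ P(D) = \int P(D|\theta)\, f_\Theta(\theta)\, d\theta \]
Because it does not depend on \( \theta \), it can be ignored when we only need to maximize the posterior.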
3 Interactive: Bayesian Updater
Let's estimate the bias of a coin \( \theta \) (the probability of Heads). Start with a Prior belief, then flip the coin and watch the Posterior update! The sketch below shows the computation behind each update.
[Interactive widget: shows the running count of Heads and Tails observed so far.]
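A minimal sketch of what the updater computes, assuming a Beta prior on \( \theta \) (the conjugate prior for coin flips, so the posterior is also Beta; the flip sequence and names below are illustrative):

```python
from scipy.stats import beta

a, b = 1.0, 1.0              # Beta(1, 1) = flat prior over [0, 1]

# Each observed flip updates the posterior by simple counting:
# posterior = Beta(a + heads, b + tails).
flips = "HHTH"               # example data: 3 Heads, 1 Tail
heads = flips.count("H")
tails = flips.count("T")

posterior = beta(a + heads, b + tails)   # Beta(4, 2)
print(posterior.mean())      # E[Theta | D] = (a+heads)/(a+b+heads+tails) = 4/6
```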
4 Point Estimation
Once we have the posterior distribution, how do we choose a single point estimate \( \hat{\theta} \)?
Maximum A Posteriori (MAP)
The value of \( \theta \) that maximizes the posterior PDF/PMF (the mode).
\[ \hat{\theta}_{MAP} = \arg \max_\theta f_{\Theta|D}(\theta|D) \]
Minimum Mean Squared Error (MMSE)
The mean of the posterior distribution, which minimizes the expected squared error \( E[(\Theta - \hat{\theta})^2 \mid D] \).
\[ \hat{\theta}_{MMSE} = E[\Theta | D] \]
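As an illustration, here are both estimates computed from the Beta(4, 2) posterior of the coin sketch in Section 3, using the standard closed forms for the Beta mode and mean:

```python
# Point estimates from the Beta(4, 2) posterior of the sketch above
# (illustrative numbers; closed forms are standard Beta facts).
a_post, b_post = 4.0, 2.0   # Beta(a + heads, b + tails) after 3 H, 1 T

# MAP = posterior mode. For Beta(a, b) with a, b > 1, the mode is
# (a - 1) / (a + b - 2).
theta_map = (a_post - 1) / (a_post + b_post - 2)    # 0.75

# MMSE = posterior mean, E[Theta | D] = a / (a + b).
theta_mmse = a_post / (a_post + b_post)             # ~0.667

print(theta_map, theta_mmse)   # the two estimates need not agree
```

Note that the two estimates differ here (0.75 vs. about 0.667); they coincide only when the posterior is symmetric about its mode.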
5 Check Your Understanding
1. If you use a Uniform Prior (flat everywhere), what is the MAP estimate equivalent to?
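Answer: with a flat prior \( f_\Theta(\theta) = c \), the posterior is proportional to the likelihood alone, so maximizing one maximizes the other:
\[ \hat{\theta}_{MAP} = \arg \max_\theta \frac{P(D|\theta)\, c}{P(D)} = \arg \max_\theta P(D|\theta) = \hat{\theta}_{MLE} \]
The MAP estimate reduces to the Maximum Likelihood Estimate (MLE).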