Bayesian Inference
Updating beliefs with data: Priors, Posteriors, and MAP Estimation.
1 Introduction
In Frequentist Inference, we assume the unknown parameter \( \theta \) is a fixed constant and estimate it from data (e.g., with the sample mean).
In Bayesian Inference, we treat the unknown parameter as a random variable \( \Theta \) (uppercase for the random variable, lowercase \( \theta \) for a particular value it may take).
- We start with a Prior Distribution \( f_\Theta(\theta) \) representing our initial belief.
- We observe data \( D \).
- We update our belief to get the Posterior Distribution \( f_{\Theta|D}(\theta|D) \) using Bayes' Rule.
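To make the update concrete, here is a minimal sketch with just two candidate values of \( \theta \); the hypotheses and numbers are illustrative, not from the text:

```python
# A minimal sketch of one Bayesian update with two competing hypotheses.
# Hypotheses: the coin is fair (theta = 0.5) or biased (theta = 0.8).
prior = {0.5: 0.5, 0.8: 0.5}            # f(theta): initial belief

# Observe D = "heads". The likelihood P(D | theta) is just theta itself.
likelihood = {theta: theta for theta in prior}

# Bayes' rule: posterior is proportional to likelihood times prior.
evidence = sum(likelihood[t] * prior[t] for t in prior)   # P(D)
posterior = {t: likelihood[t] * prior[t] / evidence for t in prior}

print(posterior)   # {0.5: ~0.38, 0.8: ~0.62}: belief shifts toward the biased coin
```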
2 Bayes' Theorem for Inference
\[ f_{\Theta|D}(\theta|D) = \frac{P(D|\theta) f_\Theta(\theta)}{P(D)} \]
Prior \( f_\Theta(\theta) \)
What we think about \( \theta \) before seeing data.
Likelihood \( P(D|\theta) \)
How probable the data is for a given \( \theta \).
Posterior \( f_{\Theta|D}(\theta|D) \)
Our updated belief about \( \theta \) after seeing data.
Evidence \( P(D) \)
Normalizing constant: the total probability of the data, averaged over the prior.
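For a continuous parameter, the evidence is obtained by marginalizing the numerator over all values of \( \theta \), which is what makes the posterior integrate to 1:
\[ P(D) = \int P(D|\theta)\, f_\Theta(\theta)\, d\theta \]
Because it does not depend on \( \theta \), it can be ignored when we only need to maximize the posterior.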
3 Interactive: Bayesian Updater
Let's estimate the bias of a coin \( \theta \) (the probability of Heads). Start with a Prior belief, then flip the coin and watch the Posterior update! The sketch below shows the computation behind each update.
[Interactive widget: shows the running count of Heads and Tails observed so far.]
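A minimal sketch of what the updater computes, assuming a Beta prior on \( \theta \) (the conjugate prior for coin flips, so the posterior is also Beta; the flip sequence and names below are illustrative):

```python
from scipy.stats import beta

a, b = 1.0, 1.0              # Beta(1, 1) = flat prior over [0, 1]

# Each observed flip updates the posterior by simple counting:
# posterior = Beta(a + heads, b + tails).
flips = "HHTH"               # example data: 3 Heads, 1 Tail
heads = flips.count("H")
tails = flips.count("T")

posterior = beta(a + heads, b + tails)   # Beta(4, 2)
print(posterior.mean())      # E[Theta | D] = (a+heads)/(a+b+heads+tails) = 4/6
```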
4 Point Estimation
Once we have the posterior distribution, how do we choose a single point estimate \( \hat{\theta} \)?
Maximum A Posteriori (MAP)
The value of \( \theta \) that maximizes the posterior PDF/PMF (the mode).
\[ \hat{\theta}_{MAP} = \arg \max_\theta f_{\Theta|D}(\theta|D) \]
Minimum Mean Squared Error (MMSE)
The mean of the posterior distribution, which minimizes the expected squared error \( E[(\Theta - \hat{\theta})^2 \mid D] \).
\[ \hat{\theta}_{MMSE} = E[\Theta | D] \]
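As an illustration, here are both estimates computed from the Beta(4, 2) posterior of the coin sketch in Section 3, using the standard closed forms for the Beta mode and mean:

```python
# Point estimates from the Beta(4, 2) posterior of the sketch above
# (illustrative numbers; closed forms are standard Beta facts).
a_post, b_post = 4.0, 2.0   # Beta(a + heads, b + tails) after 3 H, 1 T

# MAP = posterior mode. For Beta(a, b) with a, b > 1, the mode is
# (a - 1) / (a + b - 2).
theta_map = (a_post - 1) / (a_post + b_post - 2)    # 0.75

# MMSE = posterior mean, E[Theta | D] = a / (a + b).
theta_mmse = a_post / (a_post + b_post)             # ~0.667

print(theta_map, theta_mmse)   # the two estimates need not agree
```

Note that the two estimates differ here (0.75 vs. about 0.667); they coincide only when the posterior is symmetric about its mode.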
5 Check Your Understanding
1. If you use a Uniform Prior (flat everywhere), what is the MAP estimate equivalent to?
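Answer: with a flat prior \( f_\Theta(\theta) = c \), the posterior is proportional to the likelihood alone, so maximizing one maximizes the other:
\[ \hat{\theta}_{MAP} = \arg \max_\theta \frac{P(D|\theta)\, c}{P(D)} = \arg \max_\theta P(D|\theta) = \hat{\theta}_{MLE} \]
The MAP estimate reduces to the Maximum Likelihood Estimate (MLE).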