1. Joint Probability Mass Function (PMF)
For two discrete random variables \( X \) and \( Y \), the Joint PMF, written \( P_{XY}(x,y) = P(X = x, Y = y) \), gives the probability that \( X \) takes a specific value \( x \) AND \( Y \) takes a specific value \( y \) at the same time.
Just like a single PMF sums to 1, the sum of all joint probabilities over all possible pairs \((x,y)\) must be 1. $$ \sum_x \sum_y P_{XY}(x,y) = 1 $$
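For concreteness, here is a minimal NumPy sketch with a made-up 3×3 joint PMF table (the values are purely illustrative, not from this page) checking that the entries are non-negative and sum to 1:

```python
import numpy as np

# Hypothetical joint PMF: rows are values of X, columns are values of Y.
# Entry [i, j] is P(X = x_i, Y = y_j).
joint_pmf = np.array([
    [0.10, 0.05, 0.05],
    [0.10, 0.20, 0.10],
    [0.05, 0.15, 0.20],
])

assert np.all(joint_pmf >= 0)            # every probability is non-negative
assert np.isclose(joint_pmf.sum(), 1.0)  # the double sum over all (x, y) pairs is 1
```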
2. Visualizing Joint Probabilities
Imagine a grid where the height of each bar represents the probability of that specific \((x,y)\) pair occurring; together, the cells distribute the total probability of 1.
[Interactive Joint PMF table: hovering over a cell shows its probability, and the "Calculate Marginals" view shows the row and column sums (the marginal probabilities) and the conditionals.]
3. Marginal Distributions
Sometimes we have the joint distribution but we only care about one variable. We can "marginalize" the other variable out by summing over all its possible values.
Marginal PMF of X
$$ P_X(x) = \sum_y P_{XY}(x,y) $$
Intuitively: "squashing" the table horizontally.
Marginal PMF of Y
$$ P_Y(y) = \sum_x P_{XY}(x,y) $$
Intuitively: "squashing" the table vertically.
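Using the same made-up table from the sketch above, the marginals are just row and column sums:

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values

p_x = joint_pmf.sum(axis=1)  # sum over y for each x -> [0.20, 0.40, 0.40]
p_y = joint_pmf.sum(axis=0)  # sum over x for each y -> [0.25, 0.40, 0.35]

# Each marginal is itself a valid PMF.
assert np.isclose(p_x.sum(), 1.0) and np.isclose(p_y.sum(), 1.0)
```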
4. Joint Cumulative Distribution Function (CDF)
The Joint CDF, denoted as \( F_{XY}(x,y) \), gives the probability that \( X \le x \) AND \( Y \le y \).
It accumulates probability from the bottom-left "corner" up to the point \((x,y)\). It is a non-decreasing function in both \(x\) and \(y\). $$ F_{XY}(\infty, \infty) = 1, \quad F_{XY}(-\infty, y) = 0, \quad F_{XY}(x, -\infty) = 0 $$
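For a discrete table, the joint CDF is just a cumulative sum taken along both axes; a sketch with the same illustrative table:

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values

# F(x_i, y_j) = sum of all cells with X <= x_i and Y <= y_j.
joint_cdf = joint_pmf.cumsum(axis=0).cumsum(axis=1)

print(joint_cdf[0, 0])    # P(X <= x_0, Y <= y_0) = 0.10
print(joint_cdf[-1, -1])  # accumulating over everything gives 1.0
```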
5. Conditional Distributions
If we know \( Y = y \), the distribution of \( X \) changes. This is the Conditional PMF:
$$ P_{X|Y}(x|y) = \frac{P_{XY}(x,y)}{P_Y(y)} \quad \text{(for } P_Y(y) > 0\text{)} $$
It's like slicing the joint distribution grid at a specific row or column and re-normalizing (dividing by the marginal probability of that slice) so it sums to 1.
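Here is a minimal sketch of the "slice and re-normalize" step, again with the illustrative table, conditioning on the event \( Y = y_1 \) (the middle column):

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values

col = 1                                    # condition on Y = y_1
slice_y = joint_pmf[:, col]                # joint probabilities P(X = x, Y = y_1)
p_x_given_y = slice_y / slice_y.sum()      # divide by P_Y(y_1) to re-normalize

print(p_x_given_y)                         # [0.125, 0.5, 0.375]
assert np.isclose(p_x_given_y.sum(), 1.0)  # the slice now sums to 1
```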
6. Conditional Expectation
The expected value of \( X \) given that \( Y = y \) is the Conditional Expectation:
$$ E[X \mid Y = y] = \sum_x x \, P_{X|Y}(x|y) $$
Note that \( E[X|Y] \) (with \( y \) left unspecified) is itself a random variable, because its value depends on the value of \( Y \).
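A sketch of \( E[X \mid Y = y] \) computed for every \( y \) at once, assuming (purely for illustration) that \( X \) takes the values 0, 1, 2:

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values
x_values = np.array([0, 1, 2])              # hypothetical support of X

p_y = joint_pmf.sum(axis=0)                 # marginal of Y
cond_x_given_y = joint_pmf / p_y            # column j becomes P_{X|Y}(x | y_j)
e_x_given_y = x_values @ cond_x_given_y     # E[X | Y = y_j] for each column j

print(e_x_given_y)                          # [0.8, 1.25, 1.428...]: one value per y
```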
7. Law of Total Expectation
The overall expectation of \( X \) is the weighted average of its conditional expectations, weighted by the marginal PMF of \( Y \):
$$ E[X] = E\big[E[X|Y]\big] = \sum_y E[X \mid Y = y] \, P_Y(y) $$
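Continuing the illustrative numbers, the sketch below checks the law numerically: averaging the conditional expectations with weights \( P_Y(y) \) reproduces \( E[X] \):

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values
x_values = np.array([0, 1, 2])              # hypothetical support of X

p_x = joint_pmf.sum(axis=1)
p_y = joint_pmf.sum(axis=0)
e_x_given_y = x_values @ (joint_pmf / p_y)  # E[X | Y = y] for each y

direct    = x_values @ p_x                  # E[X] from the marginal of X
via_tower = e_x_given_y @ p_y               # weighted average of conditional expectations

assert np.isclose(direct, via_tower)        # both give 1.2 for this table
```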
Find the Flaw
A student claims that if \( P(X=x) \) and \( P(Y=y) \) are known, then \( P_{XY}(x,y) \) is simply \( P(X=x) \times P(Y=y) \).
Is this always true?
NO! That's only true if X and Y are Independent.
If \( X \) and \( Y \) are dependent, knowing \( X \) gives you information about \( Y \), so you can't just multiply their marginal probabilities.
Example: Let \( X \) be "It rains" and \( Y \) be "Ground is wet". Because rain almost always makes the ground wet, \( P(X,Y) \) is nearly as large as \( P(X) \) itself, while the product \( P(X) \times P(Y) \) is smaller: multiplying the marginals ignores the connection and underestimates how often the two occur together.
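To see the flaw numerically, the sketch below compares the illustrative joint table with the product of its own marginals; because this table was not built from independent variables, the two disagree:

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values

p_x = joint_pmf.sum(axis=1)
p_y = joint_pmf.sum(axis=0)

# What the joint WOULD be if X and Y were independent.
independent_joint = np.outer(p_x, p_y)

print(np.allclose(joint_pmf, independent_joint))  # False: X and Y are dependent
print(joint_pmf[0, 0], independent_joint[0, 0])   # 0.10 vs 0.20 * 0.25 = 0.05
```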