1. Joint Probability Mass Function (PMF)
For two discrete random variables \( X \) and \( Y \), the Joint PMF, written \( P_{XY}(x,y) = P(X = x, Y = y) \), gives the probability that \( X \) takes a specific value \( x \) AND \( Y \) takes a specific value \( y \) at the same time.
Just like a single PMF sums to 1, the sum of all joint probabilities over all possible pairs \((x,y)\) must be 1. $$ \sum_x \sum_y P_{XY}(x,y) = 1 $$
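For concreteness, here is a minimal NumPy sketch with a made-up 3×3 joint PMF table (the values are purely illustrative, not from this page) checking that the entries are non-negative and sum to 1:

```python
import numpy as np

# Hypothetical joint PMF: rows are values of X, columns are values of Y.
# Entry [i, j] is P(X = x_i, Y = y_j).
joint_pmf = np.array([
    [0.10, 0.05, 0.05],
    [0.10, 0.20, 0.10],
    [0.05, 0.15, 0.20],
])

assert np.all(joint_pmf >= 0)            # every probability is non-negative
assert np.isclose(joint_pmf.sum(), 1.0)  # the double sum over all (x, y) pairs is 1
```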
2. Visualizing Joint Probabilities
Imagine a grid where the height of each bar represents the probability of that specific \((x,y)\) pair occurring; together, the cells distribute the total probability of 1.
[Interactive Joint PMF table: hovering over a cell shows its probability, and the "Calculate Marginals" view shows the row and column sums (the marginal probabilities) and the conditionals.]
3. Marginal Distributions
Sometimes we have the joint distribution but we only care about one variable. We can "marginalize" the other variable out by summing over all its possible values.
Marginal PMF of X
$$ P_X(x) = \sum_y P_{XY}(x,y) $$
Intuitively: "squashing" the table horizontally.
Marginal PMF of Y
$$ P_Y(y) = \sum_x P_{XY}(x,y) $$
Intuitively: "squashing" the table vertically.
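Using the same made-up table from the sketch above, the marginals are just row and column sums:

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values

p_x = joint_pmf.sum(axis=1)  # sum over y for each x -> [0.20, 0.40, 0.40]
p_y = joint_pmf.sum(axis=0)  # sum over x for each y -> [0.25, 0.40, 0.35]

# Each marginal is itself a valid PMF.
assert np.isclose(p_x.sum(), 1.0) and np.isclose(p_y.sum(), 1.0)
```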
4. Joint Cumulative Distribution Function (CDF)
The Joint CDF, denoted as \( F_{XY}(x,y) \), gives the probability that \( X \le x \) AND \( Y \le y \).
It accumulates probability from the bottom-left "corner" up to the point \((x,y)\). It is a non-decreasing function in both \(x\) and \(y\). $$ F_{XY}(\infty, \infty) = 1, \quad F_{XY}(-\infty, y) = 0, \quad F_{XY}(x, -\infty) = 0 $$
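For a discrete table, the joint CDF is just a cumulative sum taken along both axes; a sketch with the same illustrative table:

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values

# F(x_i, y_j) = sum of all cells with X <= x_i and Y <= y_j.
joint_cdf = joint_pmf.cumsum(axis=0).cumsum(axis=1)

print(joint_cdf[0, 0])    # P(X <= x_0, Y <= y_0) = 0.10
print(joint_cdf[-1, -1])  # accumulating over everything gives 1.0
```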
5. Conditional Distributions
If we know \( Y = y \), the distribution of \( X \) changes. This is the Conditional PMF:
$$ P_{X|Y}(x|y) = \frac{P_{XY}(x,y)}{P_Y(y)} \quad \text{(for } P_Y(y) > 0\text{)} $$
It's like slicing the joint distribution grid at a specific row or column and re-normalizing (dividing by the marginal probability of that slice) so it sums to 1.
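Here is a minimal sketch of the "slice and re-normalize" step, again with the illustrative table, conditioning on the event \( Y = y_1 \) (the middle column):

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values

col = 1                                    # condition on Y = y_1
slice_y = joint_pmf[:, col]                # joint probabilities P(X = x, Y = y_1)
p_x_given_y = slice_y / slice_y.sum()      # divide by P_Y(y_1) to re-normalize

print(p_x_given_y)                         # [0.125, 0.5, 0.375]
assert np.isclose(p_x_given_y.sum(), 1.0)  # the slice now sums to 1
```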
6. Conditional Expectation
The expected value of \( X \) given that \( Y = y \) is the Conditional Expectation:
$$ E[X \mid Y = y] = \sum_x x \, P_{X|Y}(x|y) $$
Note that \( E[X|Y] \) (with \( y \) left unspecified) is itself a random variable, because its value depends on the value of \( Y \).
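A sketch of \( E[X \mid Y = y] \) computed for every \( y \) at once, assuming (purely for illustration) that \( X \) takes the values 0, 1, 2:

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values
x_values = np.array([0, 1, 2])              # hypothetical support of X

p_y = joint_pmf.sum(axis=0)                 # marginal of Y
cond_x_given_y = joint_pmf / p_y            # column j becomes P_{X|Y}(x | y_j)
e_x_given_y = x_values @ cond_x_given_y     # E[X | Y = y_j] for each column j

print(e_x_given_y)                          # [0.8, 1.25, 1.428...]: one value per y
```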
7. Law of Total Expectation
The overall expectation of \( X \) is the weighted average of its conditional expectations, weighted by the marginal PMF of \( Y \):
$$ E[X] = E\big[E[X|Y]\big] = \sum_y E[X \mid Y = y] \, P_Y(y) $$
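Continuing the illustrative numbers, the sketch below checks the law numerically: averaging the conditional expectations with weights \( P_Y(y) \) reproduces \( E[X] \):

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values
x_values = np.array([0, 1, 2])              # hypothetical support of X

p_x = joint_pmf.sum(axis=1)
p_y = joint_pmf.sum(axis=0)
e_x_given_y = x_values @ (joint_pmf / p_y)  # E[X | Y = y] for each y

direct    = x_values @ p_x                  # E[X] from the marginal of X
via_tower = e_x_given_y @ p_y               # weighted average of conditional expectations

assert np.isclose(direct, via_tower)        # both give 1.2 for this table
```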
Find the Flaw
A student claims that if \( P(X=x) \) and \( P(Y=y) \) are known, then \( P_{XY}(x,y) \) is simply \( P(X=x) \times P(Y=y) \).
Is this always true?
NO! That's only true if X and Y are Independent.
If \( X \) and \( Y \) are dependent, knowing \( X \) gives you information about \( Y \), so you can't just multiply their marginal probabilities.
Example: Let \( X \) be "It rains" and \( Y \) be "Ground is wet". Because rain almost always makes the ground wet, \( P(X,Y) \) is nearly as large as \( P(X) \) itself, while the product \( P(X) \times P(Y) \) is smaller: multiplying the marginals ignores the connection and underestimates how often the two occur together.
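To see the flaw numerically, the sketch below compares the illustrative joint table with the product of its own marginals; because this table was not built from independent variables, the two disagree:

```python
import numpy as np

joint_pmf = np.array([[0.10, 0.05, 0.05],
                      [0.10, 0.20, 0.10],
                      [0.05, 0.15, 0.20]])  # rows: x values, columns: y values

p_x = joint_pmf.sum(axis=1)
p_y = joint_pmf.sum(axis=0)

# What the joint WOULD be if X and Y were independent.
independent_joint = np.outer(p_x, p_y)

print(np.allclose(joint_pmf, independent_joint))  # False: X and Y are dependent
print(joint_pmf[0, 0], independent_joint[0, 0])   # 0.10 vs 0.20 * 0.25 = 0.05
```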