🌫️

Partial State Information & Belief State

Making optimal decisions when the world is not fully visible.

🤔

1. What is a State?

[Figure: POMDP infographic]

MDP Perspective

"A physical system characterized at any stage by a small set of parameters." - Bellman (1957)

RL Perspective

"The state must include all aspects of the past that make a difference for the future." - Sutton & Barto

Professor Powell's Definition

Optimization version: the state is a function of the history that is necessary and sufficient to compute the cost, the constraints, and the future transitions.

📊

2. Types of State Variables

📦

Physical (\(R_t\))

Inventory, Location, Battery Level.

📈

Informational (\(I_t\))

Market Prices, Weather Forecasts.

🧠

Belief (\(B_t\))

Probability distributions over quantities we cannot observe directly.

Example: Clinical Trial Modeling

  • Physical: Number of patients, dosage levels.
  • Informational: Medical guidelines, regulatory approvals.
  • Belief: Effectiveness of drug (Bayesian), likelihood of side effects.
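
As a rough illustration of how the three kinds of state variables could sit together in one state object, here is a minimal sketch for the clinical trial example. All names (`TrialState`, `BetaBelief`, `observe_cohort`) and all numbers are hypothetical, and the Beta distribution with a uniform prior is only one possible choice for the Bayesian belief about drug effectiveness.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief about an unknown response probability."""
    alpha: float = 1.0  # assumed prior pseudo-count of responders (uniform prior)
    beta: float = 1.0   # assumed prior pseudo-count of non-responders

    @property
    def mean(self) -> float:
        # Posterior mean of the response probability.
        return self.alpha / (self.alpha + self.beta)

    def update(self, responders: int, non_responders: int) -> "BetaBelief":
        # Conjugate Bayesian update: add observed counts to the pseudo-counts.
        return BetaBelief(self.alpha + responders, self.beta + non_responders)


@dataclass(frozen=True)
class TrialState:
    patients_enrolled: int      # physical, part of R_t
    dosage_mg: float            # physical, part of R_t
    approved_to_continue: bool  # informational, part of I_t
    efficacy: BetaBelief        # belief, part of B_t


def observe_cohort(state: TrialState, responders: int, non_responders: int) -> TrialState:
    """Return the next state after observing the outcomes of one cohort."""
    return replace(
        state,
        patients_enrolled=state.patients_enrolled + responders + non_responders,
        efficacy=state.efficacy.update(responders, non_responders),
    )


# Hypothetical numbers, for illustration only.
state = TrialState(patients_enrolled=0, dosage_mg=50.0,
                   approved_to_continue=True, efficacy=BetaBelief())
state = observe_cohort(state, responders=7, non_responders=3)
print(state.patients_enrolled, round(state.efficacy.mean, 3))
```

Note that the belief is part of the state: it changes with every cohort, and decisions such as whether to continue the trial depend on it just as they depend on the physical variables.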

🌫️

3. Why Partial Observation?

In the real world, we rarely have perfect information: sensors are noisy, measurements are costly, and some states are hidden entirely.

🚶

Man in Fog

Limited visibility of surroundings.

🤖

Robot in Smoke

Sensors obscured by environment.

♟️🙈

Blindfolded Chess

The board is never seen, so the state must be maintained in memory (a belief).

🚗💥

Miscalculated Belief

Fatal errors due to wrong assumptions.

🧠

4. POMDP & Belief State

POMDP (Partially Observable Markov Decision Process): The framework for decision making when the state cannot be observed perfectly.

The Solution: Sufficient Statistic

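To solve a POMDP, we convert it to a problem with perfect information by replacing the hidden state with a Sufficient Statistic: a summary of the history of observations and actions that retains everything needed to make optimal decisions.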

The most common sufficient statistic is the Belief State (\(b_k\)):

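  • Represents the probability distribution of the hidden state \(x_k\) given the full history of observations and actions.
  • Updated recursively using Bayes' rule (e.g., the Kalman Filter for linear-Gaussian models, the Particle Filter as a sampling-based approximation).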
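
The Bayes' rule update of the belief state can be sketched for a hidden state with finitely many values. This is a minimal illustration, not a full POMDP solver: the function name `belief_update`, the two-state "clear road / obstacle" model, and every probability below are assumptions made up for the example. The step first predicts the next-state distribution through the transition model, then reweights it by the likelihood of the new observation and normalizes.

```python
def belief_update(belief, transition, likelihood):
    """One Bayes-filter step for a finite hidden state.

    belief[i]        : P(x_k = i | history so far), i.e. b_k
    transition[i][j] : P(x_{k+1} = j | x_k = i) under the chosen action
    likelihood[j]    : P(new observation | x_{k+1} = j)
    Returns b_{k+1} as a list of probabilities.
    """
    n = len(belief)
    # Prediction: push the current belief through the transition model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Correction: weight each state by how well it explains the observation.
    unnormalized = [likelihood[j] * predicted[j] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [w / total for w in unnormalized]


# Hypothetical model: state 0 = "clear road", state 1 = "obstacle ahead".
belief = [0.5, 0.5]                    # b_k: no idea which state we are in
transition = [[0.9, 0.1],              # assumed dynamics for a fixed action
              [0.2, 0.8]]
likelihood_alarm = [0.1, 0.7]          # assumed P(sensor alarm | state)

new_belief = belief_update(belief, transition, likelihood_alarm)
print(new_belief)  # probability mass shifts toward "obstacle ahead"
```

The Kalman Filter performs the same predict-and-correct recursion in closed form when the model is linear with Gaussian noise, and the Particle Filter approximates it with samples when no closed form is available.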

📝

5. Test Your Knowledge

1. In a self-driving car, "Traffic Light Color" is which type of variable?

2. "Effectiveness of a new drug" in a trial is best modeled as:

3. Why is exact solution of POMDPs difficult?

4. What is a "Sufficient Statistic"?

5. Which method is used to update belief states?
