🔭

One-Step & Multistep Lookahead

Exploring different horizons for approximation in value space.

Approximation Methods Overview

1. Ways of Approximation

We've established that we need to approximate the optimal cost-to-go function \( J^* \). But how far should we look ahead before applying this approximation?

1️⃣

One-Step Lookahead

Look one stage ahead, then approximate.

🔢

Multistep Lookahead

Look \( \ell \) stages ahead, then approximate.

👀

2. One-Step Lookahead

Definition

The algorithm involves a one-step lookahead minimization if it solves a one-stage DP problem for each \( k \):

\[ \tilde{u}_k = \arg\min_{u_k \in U_k(\tilde{x}_k)} \left[ g_k(\tilde{x}_k, u_k) + \tilde{J}_{k+1} (f_k(\tilde{x}_k, u_k)) \right] \]
  • At state \( k \), you look ahead to various possible next states \( x_{k+1} \).
  • Minimization is performed over controls \( u_k \).
  • We require the approximate cost \( \tilde{J}_{k+1} \).
🪜

3. Multistep Lookahead

Multistep lookahead involves solving an \( \ell \)-step DP problem, where \( 1 < \ell < N - k \), with a terminal cost function approximation \( \tilde{J}_{k+\ell} \).

Why use Multistep?

It typically provides better performance. The intuitive reason is that by treating \( \ell \) stages "exactly" via optimization, the effect of the approximation error \( \tilde{J}_{k+\ell} - J^*_{k+\ell} \) tends to diminish as \( \ell \) increases.

Example: In AlphaZero chess, long multistep lookahead is critical for superior performance.

Trade-off:

The solution becomes more time-consuming compared to one-step lookahead.

♟️

4. Example: Chess

Analyzing the Game

  • ?
    States:

    The arrangement of pieces on the board.

  • ?
    Actions (Controls):

    Legal moves available to the current player.

  • ?
    Lookahead:

    Calculating potential future board states. Grandmasters might look 10-15 steps ahead (Multistep), while novices might only look 1 step ahead.

But how do I compute \( \tilde{J} \)?

We've talked about using \( \tilde{J} \) for approximation, but we haven't discussed where this function comes from.

Find out in the next lecture →
🧠

Test Your Knowledge

1. Which lookahead typically offers better performance?

2. What is the main downside of Multistep Lookahead?

3. In the Chess example, what corresponds to the "Control" \( u_k \)?

Previous

Lecture 08

Next

Lecture 10