✂️

Principle of Optimality

The foundation of Dynamic Programming: Constructing optimal solutions from optimal sub-solutions.

🦎

1. The Tail Sub-problem

graph LR subgraph "Past (Fixed)" x0((x0)) -->|u0*| x1((...)) x1 -->|...| xk((xk*)) end subgraph "Tail Sub-problem" xk -->|uk| xk1((xk+1)) xk1 -->|...| xN((xN)) end style xk fill:#fcd34d,stroke:#f59e0b,stroke-width:4px style xN fill:#dcfce7,stroke:#16a34a

Concept

  • Let \( \{ u_0^*, \dots, u_{N-1}^* \} \) be an optimal control sequence with state sequence \( \{ x_1^*, \dots, x_N^* \} \).
  • Consider the tail sub-problem starting at \( x_k^* \) at time \( k \).
  • We want to minimize the "cost-to-go" from \( k \) to \( N \): \[ g_k(x_k^*, u_k) + \sum_{m=k+1}^{N-1} g_m(x_m, u_m) + g_N(x_N) \]
  • Key Insight: The tail of the optimal sequence \( \{ u_k^*, \dots, u_{N-1}^* \} \) is optimal for this sub-problem.
📜

2. Theorem: Principle of Optimality

Statement

Let \( \{u_0^*, \dots, u_{N-1}^*\} \) be an optimal control sequence, which together with \( x_0 \) determines the corresponding state sequence \( \{x_1^*, \dots, x_N^*\} \).

Consider the subproblem whereby we start at \( x_k^* \) at time \( k \) and wish to minimize the "cost-to-go" from time \( k \) to time \( N \), \[ g_k(x_k^*, u_k) + \sum_{m=k+1}^{N-1} g_m(x_m, u_m) + g_N(x_N), \] over \( \{u_k, \dots, u_{N-1}\} \) with \( u_m \in U_m(x_m) \).

Then the truncated optimal control sequence \( \{u_k^*, \dots, u_{N-1}^*\} \) is optimal for this subproblem.

🗣️

3. In Plain English

The principle of optimality suggests that the optimal cost function can be constructed in a piecemeal fashion going backwards.

1

First compute the optimal cost function for the "tail subproblem" involving the last stage.

2

Then solve the "tail subproblem" involving the last two stages.

3

Continue in this manner until the optimal cost function for the entire problem is constructed.

📝

4. Test Your Knowledge

1. If a path is optimal, what can we say about its tail?

2. In which direction does the Principle of Optimality suggest we solve the problem?

3. What defines a "Tail Sub-problem"?

Previous

Lecture 01

Next

Lecture 03