🗣️

RL & DP Terminology

Decoding the language of Reinforcement Learning and Dynamic Programming.

🗿

1. The Rosetta Stone

Reinforcement Learning (RL) and Dynamic Programming (DP) often describe the same concepts using different words. Here is the translation guide:

Reinforcement Learning (RL) | Dynamic Programming (DP) / Control
Environment | System
Agent | Decision Maker / Controller
Action | Decision / Control
Reward | (Negative) Cost
Value Function | (Negative) Cost Function
Action Value (Q-Value) | Q-Factor of State-Control Pair
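
The "(Negative)" entries mean that a reward is a cost with its sign flipped, so maximizing one is the same as minimizing the other. Here is a minimal sketch of that equivalence; the action names and numbers are invented for illustration.

```python
# Hypothetical Q-values for a single state (numbers invented for illustration).
q_values = {"left": 1.0, "stay": 0.5, "right": 2.0}  # RL: reward-to-go

# DP convention: the Q-factor is the negated Q-value, i.e. a cost-to-go.
q_factors = {action: -value for action, value in q_values.items()}

rl_choice = max(q_values, key=q_values.get)    # the agent maximizes value
dp_choice = min(q_factors, key=q_factors.get)  # the controller minimizes cost

state_value = max(q_values.values())  # RL: value of the state
state_cost = min(q_factors.values())  # DP: optimal cost of the state

print(rl_choice, dp_choice, state_value, state_cost)
```

Both conventions select the same action, and the state's value is the negative of its optimal cost.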

🧠

3. Advanced Concepts

Planning vs. Learning

Planning: Solving a DP problem with a known model.

Learning: Solving a DP problem without an explicit model (using simulation/data).
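
To make the contrast concrete, here is a rough sketch on a hypothetical two-state, two-action problem. The model, rewards, discount factor, and step size are all assumptions chosen for illustration. `plan` reads the model directly (value iteration on Q-values), while `learn` only calls a black-box simulator (tabular Q-learning, used here as one possible learning method).

```python
import random

GAMMA = 0.9          # assumed discount factor
STATES = (0, 1)
ACTIONS = (0, 1)

# Hypothetical deterministic model: (state, action) -> (next_state, reward).
MODEL = {
    (0, 0): (0, 0.0),
    (0, 1): (1, 1.0),
    (1, 0): (0, 0.0),
    (1, 1): (1, 2.0),
}


def plan(model, sweeps=500):
    """Planning: value iteration on Q-values, reading the model directly."""
    q = {sa: 0.0 for sa in model}
    for _ in range(sweeps):
        q = {
            (s, a): r + GAMMA * max(q[(s2, b)] for b in ACTIONS)
            for (s, a), (s2, r) in model.items()
        }
    return q


def simulate(state, action):
    """Black-box simulator: the learner sees only sampled transitions."""
    return MODEL[(state, action)]


def learn(steps=20000, alpha=0.5, seed=0):
    """Learning: tabular Q-learning driven by simulated transitions only."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = 0
    for _ in range(steps):
        action = rng.choice(ACTIONS)  # purely exploratory behaviour
        next_state, reward = simulate(state, action)
        target = reward + GAMMA * max(q[(next_state, b)] for b in ACTIONS)
        q[(state, action)] += alpha * (target - q[(state, action)])
        state = next_state
    return q


def greedy(q):
    """Pick the highest-valued action in each state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


q_plan = plan(MODEL)
q_learn = learn()
print(greedy(q_plan), greedy(q_learn))
```

On this toy problem both routes arrive at the same greedy policy. The only difference is whether the algorithm inspects `MODEL` or merely samples from it.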

Deep RL

Approximate DP using value and/or policy approximation with Deep Neural Networks.

Self-Learning / Self-Play

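Solving a DP problem using some form of Policy Iteration, often the optimistic variant, in which each policy is evaluated only approximately (for example, with a few value iteration steps) before it is improved.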
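
As a rough illustration, optimistic Policy Iteration can be sketched as alternating a policy improvement step with a truncated evaluation of only a few sweeps. The model, discount factor, and sweep counts below are assumptions chosen for illustration, not taken from the lecture.

```python
GAMMA = 0.9          # assumed discount factor
STATES = (0, 1)
ACTIONS = (0, 1)

# Hypothetical deterministic model: (state, action) -> (next_state, reward).
MODEL = {
    (0, 0): (0, 0.0),
    (0, 1): (1, 1.0),
    (1, 0): (0, 0.0),
    (1, 1): (1, 2.0),
}


def backup(state, action, values):
    """One-step lookahead: reward plus discounted value of the next state."""
    next_state, reward = MODEL[(state, action)]
    return reward + GAMMA * values[next_state]


def optimistic_policy_iteration(eval_sweeps=3, iterations=200):
    values = {s: 0.0 for s in STATES}
    policy = {s: ACTIONS[0] for s in STATES}
    for _ in range(iterations):
        # Policy improvement: act greedily with respect to current values.
        policy = {
            s: max(ACTIONS, key=lambda a: backup(s, a, values))
            for s in STATES
        }
        # Optimistic (truncated) evaluation: a few sweeps, not to convergence.
        for _ in range(eval_sweeps):
            values = {s: backup(s, policy[s], values) for s in STATES}
    return policy, values


opi_policy, opi_values = optimistic_policy_iteration()
print(opi_policy, opi_values)
```

Setting `eval_sweeps=1` reduces this to value iteration, while a very large `eval_sweeps` approaches standard Policy Iteration with exact evaluation.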

Prediction

Equivalent to Policy Evaluation (finding the value of a fixed policy).
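
A minimal sketch of prediction as iterative Policy Evaluation follows. The two-state model, the fixed policy, and the discount factor are hypothetical values chosen for illustration.

```python
GAMMA = 0.9  # assumed discount factor

# Hypothetical deterministic model: (state, action) -> (next_state, reward).
MODEL = {
    (0, 0): (0, 0.0),
    (0, 1): (1, 1.0),
    (1, 0): (0, 0.0),
    (1, 1): (1, 2.0),
}

# The fixed policy to evaluate: state -> action. No improvement step is taken.
POLICY = {0: 1, 1: 0}


def evaluate(policy, model, gamma=GAMMA, tol=1e-12):
    """Repeat the Bellman backup for a fixed policy until values stop changing."""
    values = {s: 0.0 for s in policy}
    while True:
        new_values = {}
        for state, action in policy.items():
            next_state, reward = model[(state, action)]
            new_values[state] = reward + gamma * values[next_state]
        change = max(abs(new_values[s] - values[s]) for s in values)
        values = new_values
        if change < tol:
            return values


policy_values = evaluate(POLICY, MODEL)
print(policy_values)
```

For this policy the values satisfy V(0) = 1 + 0.9 V(1) and V(1) = 0.9 V(0), so V(0) = 1 / (1 - 0.81), which the iteration should approach.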

📝

4. Test Your Knowledge

1. In RL, "Maximizing Value" is equivalent to what in DP?

2. What is "Experience Replay"?

3. "System Identification" in Control is similar to what in RL?
