
Reinforcement Learning - Theoretical Foundations: Part II

RL Continued - Dynamic Programming

2021.01.05 · 3 min read · by Zhenlin Wang · updated 2022-08-19

Dynamic Programming in RL

Introduction

Synchronous DP

The following table summarizes the types of problems that are solved synchronously with iteration/evaluation algorithms:

| Problem | Bellman Equation | Algorithm |
| --- | --- | --- |
| Prediction | Bellman Expectation Equation | Iterative Policy Evaluation |
| Control | Bellman Expectation Equation + Greedy Policy Improvement | Policy Iteration |
| Control | Bellman Optimality Equation | Value Iteration |
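To make the three rows of the table concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the two-state toy MDP, the transition format `P[s][a] = [(probability, next_state, reward), ...]`, the discount `GAMMA = 0.9`, and all function names. Each sweep is synchronous, meaning the new value table is built only from the previous one.

```python
# Hypothetical toy MDP: P[s][a] is a list of (probability, next_state, reward).
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # assumed discount factor


def q_value(P, V, s, a, gamma):
    """One-step lookahead: expected reward plus discounted next-state value."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def policy_evaluation(P, policy, gamma=GAMMA, tol=1e-10):
    """Prediction: iterate the Bellman expectation backup for a fixed policy."""
    V = {s: 0.0 for s in P}
    while True:
        # Synchronous sweep: V_new depends only on the old V.
        V_new = {s: q_value(P, V, s, policy[s], gamma) for s in P}
        delta = max(abs(V_new[s] - V[s]) for s in P)
        V = V_new
        if delta < tol:
            return V


def greedy_policy(P, V, gamma=GAMMA):
    """Policy improvement: act greedily with respect to V."""
    return {s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma)) for s in P}


def policy_iteration(P, gamma=GAMMA):
    """Control: alternate evaluation and greedy improvement until stable."""
    policy = {s: next(iter(P[s])) for s in P}  # arbitrary initial policy
    while True:
        V = policy_evaluation(P, policy, gamma)
        new_policy = greedy_policy(P, V, gamma)
        if new_policy == policy:
            return policy, V
        policy = new_policy


def value_iteration(P, gamma=GAMMA, tol=1e-10):
    """Control: iterate the Bellman optimality backup directly."""
    V = {s: 0.0 for s in P}
    while True:
        V_new = {s: max(q_value(P, V, s, a, gamma) for a in P[s]) for s in P}
        delta = max(abs(V_new[s] - V[s]) for s in P)
        V = V_new
        if delta < tol:
            return V


policy, V_pi = policy_iteration(P)
V_star = value_iteration(P)
```

On this toy problem both control methods should agree: staying in state 1 with action 1 earns reward 2 forever, so its value approaches 2 / (1 - 0.9) = 20, and state 0 reaches it in one step for a value near 1 + 0.9 × 20 = 19.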

Iterative Policy Evaluation

Policy Improvement

Value Iteration

Contraction Mapping Theorem

Asynchronous DP

There are three simple ideas, which I haven’t studied in detail: