Chapter 4: Dynamic Programming
Algorithm Implementation Using gym-classic
- Value Iteration and Policy Iteration for Russell and Norvig’s 4x3 grid world. This notebook discusses the implementation of the algorithms.
- Value Iteration and Policy Iteration for the L-maze. This notebook investigates the behavior of the algorithms.
- Assignment: Use Dynamic Programming Methods to Solve the Teleport Maze
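
For readers who want a quick reminder of what policy iteration does before opening the notebooks, here is a minimal sketch. It does not use gym-classic or any of the environments listed above. The four-state corridor, the `step` model, the reward of -1 per move, and the discount factor of 0.9 are all assumptions made up for this illustration.

```python
# Minimal policy iteration sketch on a hypothetical 4-state corridor.
# States 0..3; state 3 is the terminal goal. Every move costs -1.
# This is an illustrative toy model, not one of the gym-classic environments.

GAMMA = 0.9                 # assumed discount factor
N_STATES = 4
GOAL = 3
ACTIONS = ("left", "right")


def step(s, a):
    """Deterministic toy model: return (next_state, reward)."""
    s2 = max(s - 1, 0) if a == "left" else s + 1
    return s2, -1.0


def q_value(V, s, a):
    """One-step lookahead value of taking action a in state s."""
    s2, r = step(s, a)
    return r + GAMMA * V[s2]


def evaluate(policy, theta=1e-10):
    """Iterative policy evaluation; the goal state keeps value 0."""
    V = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(GOAL):
            v_new = q_value(V, s, policy[s])
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < theta:
            return V


def improve(V):
    """Greedy policy with respect to V for all non-terminal states."""
    return [max(ACTIONS, key=lambda a: q_value(V, s, a)) for s in range(GOAL)]


def policy_iteration():
    policy = ["left"] * GOAL          # deliberately poor starting policy
    while True:
        V = evaluate(policy)
        new_policy = improve(V)
        if new_policy == policy:      # policy is stable, so we stop
            return policy, V
        policy = new_policy


policy, V = policy_iteration()
print(policy, [round(v, 3) for v in V])
```

The loop alternates between evaluating the current policy and making it greedy with respect to the resulting values, and it stops once the greedy step no longer changes the policy.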
Model and Algorithm Implementation (from scratch)
- Modeling Tic-Tac-Toe as an MDP and applying Value Iteration. This notebook defines the MDP from scratch and applies value iteration to find an optimal policy for playing the game.
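
To give a sense of what "defining an MDP from scratch and applying value iteration" can look like, here is a minimal sketch. It is not the Tic-Tac-Toe model from the notebook. The two-state MDP, the transition table `P`, and the discount factor are hypothetical and chosen only so the result can be checked by hand.

```python
# Minimal value iteration sketch on a hypothetical 2-state MDP.
# P[state][action] is a list of (probability, next_state, reward) tuples.
# The model below is invented for illustration only.

P = {
    0: {
        "stay": [(1.0, 0, 0.0)],
        "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)],
    },
    1: {
        "stay": [(1.0, 1, 0.0)],   # absorbing state with no further reward
    },
}


def q_value(V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(gamma=0.9, theta=1e-10):
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Bellman optimality update: back up the best action value.
            best = max(q_value(V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Extract a greedy policy from the converged values.
    policy = {s: max(P[s], key=lambda a: q_value(V, s, a, gamma)) for s in P}
    return V, policy


V, policy = value_iteration()
print(V, policy)
```

In this toy model the value of state 0 satisfies V(0) = 0.8 + 0.18 V(0), so the iteration should approach 0.8 / 0.82 with "go" as the greedy action.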
License
© 2025 Michael Hahsler. All code and documents in this repository are provided under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License.