Introduction to Reinforcement Learning

Material for an introductory course on reinforcement learning for computer scientists

Chapter 4: Dynamic Programming

Algorithm Implementation Using `gym-classic`

Value Iteration and Policy Iteration for Russell and Norvig’s 4x3 grid world. This notebook discusses the implementation of the algorithms.
Value Iteration and Policy Iteration for the L-maze. This notebook investigates the behavior of the algorithms.
Assignment: Use Dynamic Programming Methods to Solve the Teleport Maze

Model and Algorithm Implementation (from scratch)

Modeling Tic-Tac-Toe as an MDP and applying Value Iteration. This notebook defines the MDP from scratch and applies value iteration to find an optimal policy for playing the game.
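
For orientation, here is a minimal sketch of value iteration on a hand-built MDP. It is not taken from the notebooks: the three-state chain, the `P` dictionary layout, and the function name `value_iteration` are assumptions made for illustration only, and the Tic-Tac-Toe model in the notebook is considerably larger.

```python
def q_value(outcomes, V, gamma):
    """Expected return of one action, given its (prob, next_state, reward) outcomes."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)


def value_iteration(P, gamma=0.9, theta=1e-8):
    """Value iteration for a small tabular MDP.

    P[s][a] is a list of (prob, next_state, reward) triples.
    A state with an empty action dict is treated as terminal.
    This data layout is an assumption for this sketch.
    """
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s, actions in P.items():
            if not actions:  # terminal state keeps value 0
                continue
            # Bellman optimality update (in place)
            best = max(q_value(outcomes, V, gamma) for outcomes in actions.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:  # stop once a full sweep changes no value by theta or more
            break
    # Extract a greedy policy from the converged values
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], V, gamma))
        for s, actions in P.items()
        if actions
    }
    return V, policy


# Hypothetical deterministic chain: A -> B -> goal, reward 1 for reaching the goal
P = {
    "A": {"right": [(1.0, "B", 0.0)], "stay": [(1.0, "A", 0.0)]},
    "B": {"right": [(1.0, "goal", 1.0)], "left": [(1.0, "A", 0.0)]},
    "goal": {},
}

V, policy = value_iteration(P, gamma=0.9)
print(V)
print(policy)
```

With a discount factor of 0.9, the state next to the goal should converge to a value of 1 and the state two steps away to 0.9, with the greedy policy moving right in both. The same loop applies to any MDP expressed in this form, including a game such as Tic-Tac-Toe once its states, actions, and transitions have been enumerated.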

License

© 2025 Michael Hahsler. All code and documents in this repository are provided under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License.
