Chapters 17 and 22: Reinforcement Learning and MDPs
Chapter 17: MDPs
- 4x3 Grid World: Markov Decision Processes solved with Value Iteration and Policy Iteration (in R)
- L-Maze: Solving a Maze using Value Iteration (in R with package markovDP)
- Connection to playing games (Chapter 5): Finding the Optimal Policy to Play Tic-Tac-Toe with Value Iteration: uses value iteration to find the optimal policy for playing the game (in Python)
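As a rough illustration of what value iteration does in the examples above, here is a minimal Python sketch. It does not reproduce the linked code: the 4-state corridor, the names `step_model`, `q_value`, and `value_iteration`, and the constants `GAMMA` and `theta` are all assumptions made up for this sketch.

```python
# Hypothetical toy MDP (not taken from the linked examples):
# a corridor of states 0..3, where entering state 3 pays +1 and ends the episode.
GAMMA = 0.9
STATES = [0, 1, 2, 3]
TERMINAL = 3
ACTIONS = {"left": -1, "right": +1}


def step_model(state, action):
    """Deterministic transition model: return (next_state, reward)."""
    next_state = min(max(state + ACTIONS[action], 0), TERMINAL)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def q_value(values, state, action):
    """One-step lookahead: reward plus discounted value of the successor."""
    next_state, reward = step_model(state, action)
    return reward + GAMMA * values[next_state]


def value_iteration(theta=1e-9):
    """Apply the Bellman optimality update until the largest change is below theta."""
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s == TERMINAL:
                continue  # the terminal state keeps value 0
            best = max(q_value(values, s, a) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < theta:
            break
    # Extract the greedy policy from the converged values.
    policy = {
        s: max(ACTIONS, key=lambda a: q_value(values, s, a))
        for s in STATES
        if s != TERMINAL
    }
    return values, policy


values, policy = value_iteration()
print(values)
print(policy)
```

In this toy problem the values converge to roughly 0.81, 0.9, and 1.0 for states 0 to 2, and the greedy policy moves right in every non-terminal state.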
Chapter 22: Reinforcement Learning (RL)
- 4x3 Grid World: A Q-Learning Agent (in R)
- L-Maze: Solving a Maze using Q-Learning (in R with package markovDP)
- Connection to playing games (Chapter 5): Learning to Play Tic-Tac-Toe with Q-Learning: uses a simple table-based Q-learning algorithm to learn to play the game (in Python)
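For comparison with the MDP examples, the following is a minimal sketch of table-based Q-learning in Python. It is not the code from the linked examples: the 4-state corridor environment, the function names `env_step`, `q_learning`, and `greedy_policy`, and the hyperparameters `ALPHA`, `GAMMA`, and `EPSILON` are assumptions chosen only for illustration.

```python
import random

# Hypothetical hyperparameters and toy environment (not from the linked examples).
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2
N_STATES, TERMINAL = 4, 3
ACTIONS = {"left": -1, "right": +1}


def env_step(state, action):
    """Corridor environment: return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), TERMINAL)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward, next_state == TERMINAL


def q_learning(episodes=500, seed=0):
    """Learn a Q-table from sampled transitions, without using a transition model."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < EPSILON:
                action = rng.choice(list(ACTIONS))
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = env_step(state, action)
            # Temporal-difference target: no bootstrapping from a terminal state.
            if done:
                target = reward
            else:
                target = reward + GAMMA * max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += ALPHA * (target - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return {
        s: max(ACTIONS, key=lambda a: q[(s, a)])
        for s in range(N_STATES)
        if s != TERMINAL
    }


q = q_learning()
policy = greedy_policy(q)
print(policy)
```

Unlike value iteration, this agent never calls a transition model directly; it only observes the outcomes of `env_step`, which is the key difference between the Chapter 17 and Chapter 22 examples.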
More on Reinforcement Learning
More on RL can be found in the course material Reinforcement Learning: Lecture Material, Simple Python Code Examples and Assignments.
Other Software (Python)
- Gymnasium is an open-source Python library for developing and comparing reinforcement learning algorithms.
- Stable Baselines3 is a Deep Reinforcement Learning library.
License
© 2025-2026 Michael Hahsler. All code and documents in this repository are provided under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License.