Chapters 17 and 22: Reinforcement Learning and MDPs
Chapter 17: MDPs
- Example: 4x3 Grid World: Markov Decision Processes solved with Value Iteration and Policy Iteration (in R)
- Example: L-Maze: Solving a Maze using RL (Value Iteration) (in R with package markovDP)
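As a rough orientation for the value iteration examples above, here is a minimal Python sketch. It is not taken from the linked R notebooks or from the markovDP package. It assumes the common textbook setup of the 4x3 grid world: a wall at (2, 2), terminal states at (4, 3) with utility +1 and at (4, 2) with utility -1, a reward of -0.04 in every non-terminal state, no discounting, and moves that succeed with probability 0.8 and slip to either perpendicular direction with probability 0.1 each. All names (`value_iteration`, `greedy_policy`, and so on) are made up for illustration.

```python
# Value iteration sketch for a 4x3 grid world (all settings below are assumptions).
GAMMA = 1.0          # discount factor (assumed: no discounting)
STEP_REWARD = -0.04  # reward received in every non-terminal state (assumed)
WALL = (2, 2)
TERMINALS = {(4, 3): 1.0, (4, 2): -1.0}  # terminal states with fixed utilities
ACTIONS = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}
# Perpendicular directions the agent may slip into for each intended action.
SLIPS = {"U": ("L", "R"), "D": ("L", "R"), "L": ("U", "D"), "R": ("U", "D")}

# States are (column, row) pairs, columns 1..4 and rows 1..3, without the wall.
STATES = [(x, y) for x in range(1, 5) for y in range(1, 4) if (x, y) != WALL]


def move(state, action):
    """Deterministic move; bumping into the wall or the border keeps the agent in place."""
    dx, dy = ACTIONS[action]
    nxt = (state[0] + dx, state[1] + dy)
    return nxt if nxt in STATES else state


def transitions(state, action):
    """Return (probability, next_state) pairs: 0.8 intended, 0.1 for each slip."""
    slip_a, slip_b = SLIPS[action]
    return [(0.8, move(state, action)),
            (0.1, move(state, slip_a)),
            (0.1, move(state, slip_b))]


def expected_utility(U, state, action):
    """Expected utility of the successor state when taking `action` in `state`."""
    return sum(p * U[s2] for p, s2 in transitions(state, action))


def value_iteration(tol=1e-10, max_iter=10_000):
    """Apply Bellman updates until the largest change falls below `tol`."""
    U = {s: TERMINALS.get(s, 0.0) for s in STATES}
    for _ in range(max_iter):
        new_U = dict(U)
        delta = 0.0
        for s in STATES:
            if s in TERMINALS:
                continue  # terminal utilities stay fixed
            best = max(expected_utility(U, s, a) for a in ACTIONS)
            new_U[s] = STEP_REWARD + GAMMA * best
            delta = max(delta, abs(new_U[s] - U[s]))
        U = new_U
        if delta < tol:
            break
    return U


def greedy_policy(U):
    """Pick, in every non-terminal state, the action with the highest expected utility."""
    return {s: max(ACTIONS, key=lambda a: expected_utility(U, s, a))
            for s in STATES if s not in TERMINALS}


U = value_iteration()
policy = greedy_policy(U)

# Print the utilities row by row, top row first.
for y in (3, 2, 1):
    print(" ".join(f"{U[(x, y)]:6.3f}" if (x, y) in U else "  wall" for x in range(1, 5)))
```

Policy iteration would reuse `transitions` and `expected_utility`, alternating between evaluating a fixed policy and improving it greedily.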
Chapter 22: Reinforcement Learning
- Example: 4x3 Grid World: A Q-Learning Agent (in R)
- Example: L-Maze: Solving a Maze using RL (Q-Learning) (in R with package markovDP)
- Connection to playing games (Chapter 5): "Learning to Play Tic-Tac-Toe with Q-Learning" implements a simple table-based Q-learning algorithm that learns to play the game (in Python).
- Connection to playing games (Chapter 5): "Learning the Optimal Policy to Play Tic-Tac-Toe with Value Iteration" uses value iteration to find the optimal policy for the game (in Python).
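The table-based Q-learning used in the examples above can be sketched in a few lines of Python. This sketch is not the code from the linked notebooks. It uses a hypothetical toy task (a five-state corridor whose rightmost state is the goal and pays a reward of 1), and the learning rate, discount factor and exploration rate are arbitrary assumptions. The update rule itself is the standard Q-learning update.

```python
import random

# Hypothetical toy task: a corridor with states 0..4; state 4 is the terminal goal.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = (-1, +1)                      # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # assumed learning rate, discount, exploration rate


def step(state, action):
    """Deterministic environment: returns (next_state, reward, done)."""
    nxt = min(max(state + action, 0), GOAL)   # the corridor ends act like walls
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def choose_action(Q, state, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(Q[state].values())
    return rng.choice([a for a in ACTIONS if Q[state][a] == best])


def q_learning(episodes=500, max_steps=100, seed=0):
    """Tabular Q-learning; returns the learned Q-table as a nested dict."""
    rng = random.Random(seed)
    Q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            action = choose_action(Q, state, rng)
            nxt, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best action in the next state.
            target = reward if done else reward + GAMMA * max(Q[nxt].values())
            Q[state][action] += ALPHA * (target - Q[state][action])
            state = nxt
            if done:
                break
    return Q


Q = q_learning()
# Greedy policy for the non-terminal states (+1 means "move right").
greedy = {s: max(ACTIONS, key=lambda a: Q[s][a]) for s in range(GOAL)}
print(greedy)
```

For a game such as Tic-Tac-Toe, the state would be the board position and the actions the empty squares, but the table update stays the same.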
More on Reinforcement Learning
These examples implement methods described in the book Reinforcement Learning: An Introduction by Sutton and Barto (2020).
- Example: 4x3 Grid World: Monte Carlo Control (in R)
- Example: 4x3 Grid World: TD Control with Sarsa, Q-Learning and Expected Sarsa (in R)
- R package: markovDP
Other Software (Python)
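- Gymnasium is an open-source Python library for developing and comparing reinforcement learning algorithms. It provides a standard API for communication between learning algorithms and environments.
- CleanRL is a deep reinforcement learning library that provides single-file implementations of common algorithms.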
License
All code and documents in this repository are provided under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License.