Chapter 6: Temporal-Difference Learning
Examples
- Sarsa and Q-Learning for the Cliff Walking Environment (using gym-classic).
- Learning to Play Tic-Tac-Toe with Q-Learning implements a simple table-based Q-learning algorithm to play the game from scratch.
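
For orientation, the two update rules used in these examples differ only in how they bootstrap from the next state. The following is a minimal sketch, not the code from the notebooks: the function names, the dictionary-based Q-table keyed by `(state, action)`, and the parameter names `alpha` (step size) and `gamma` (discount) are assumptions made for illustration.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma, done):
    """On-policy TD update: bootstrap from the action actually taken next."""
    # At a terminal transition there is no next action value to bootstrap from.
    target = r if done else r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s_next, n_actions, alpha, gamma, done):
    """Off-policy TD update: bootstrap from the greedy action in the next state."""
    best_next = 0.0 if done else max(Q[(s_next, b)] for b in range(n_actions))
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


# Hypothetical Q-table: unseen (state, action) pairs default to 0.0.
Q = defaultdict(float)
Q[(1, 0)] = 2.0
Q[(1, 1)] = 4.0

# Sarsa uses Q[(1, 0)], the value of the action chosen in state 1.
sarsa_update(Q, s=0, a=0, r=-1.0, s_next=1, a_next=0,
             alpha=0.5, gamma=1.0, done=False)

# Q-learning uses max(Q[(1, 0)], Q[(1, 1)]), regardless of the action chosen.
q_learning_update(Q, s=0, a=1, r=-1.0, s_next=1, n_actions=2,
                  alpha=0.5, gamma=1.0, done=False)
```

In this sketch the same transition gives different updates because Sarsa evaluates the policy it follows, while Q-learning evaluates the greedy policy. On Cliff Walking, that is the usual reason the two methods learn different paths under epsilon-greedy exploration.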
License
© 2025 Michael Hahsler. All code and documents in this repository are provided under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License.