
Introduction to Reinforcement Learning

Material for an introductory course on reinforcement learning for computer scientists

Chapters 9-10: Prediction and Control using Approximation

Introduction

Explanation: What is the On-Policy State Distribution?
Example: Semi-gradient TD(0) prediction with approximation using linear features for a simple grid world (no walls).
Example: Semi-gradient Sarsa(0) control with approximation using linear features for a simple grid world (no walls).

Examples Where Simple Linear Features Fail

Advanced Feature Construction

Linear approximation with Fourier basis features (4x3 Gridworld).
Linear approximation with Fourier basis features (L-Maze).
Richard Sutton’s Tile Coding Software tiles3.py (retrieved from here). Code explanation created with ChatGPT.

Exercise

Value Function Approximation for the Lunar Lander Problem using Tile Coding

License

© 2026 Michael Hahsler. All code and documents in this repository are provided under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License.
