Chapter 5: Adversarial Search and Games
Contents
Defining a Game
- Example Tic-Tac-Toe
Connection to search with nondeterministic actions (from Chapter 4.3)
- Example: Solving Tic-Tac-Toe with And-Or-Tree Search. Here the opponent is seen as part of the environment, i.e., each action by the player is followed by an unknown action of the opponent, which, from the player's viewpoint, makes the outcomes of the player's actions nondeterministic.
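A minimal sketch of this idea, assuming a 9-cell tuple board encoding, marks 'x' and 'o', and "win or draw" as the player's goal; this is not the repository's notebook code, just an illustration of AND-OR search where every opponent reply is treated as a nondeterministic outcome the plan must handle:

```python
from functools import lru_cache

LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(board):
    """Return 'x' or 'o' if that player has three in a row, otherwise None."""
    for a, b, c in LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def moves(board):
    return [i for i in range(9) if board[i] is None]

def place(board, i, mark):
    return board[:i] + (mark,) + board[i + 1:]

@lru_cache(maxsize=None)
def or_search(board):
    """OR node (player 'x' to move): one action must lead to a win or a draw."""
    w = winner(board)
    if w == 'o':
        return False
    if w == 'x' or not moves(board):
        return True
    return any(and_search(place(board, m, 'x')) for m in moves(board))

@lru_cache(maxsize=None)
def and_search(board):
    """AND node (opponent 'o' to move): the plan must work for every reply."""
    if winner(board) == 'x' or not moves(board):
        return True
    return all(or_search(place(board, m, 'o')) for m in moves(board))

print(or_search((None,) * 9))  # True: 'x' has a plan that wins or at least draws
```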
Solving games using adversarial search
- Example: Solving Tic-Tac-Toe with Minimax Search and Alpha-Beta Pruning (a minimal sketch of the algorithm follows this list)
- Example: Solving Tic-Tac-Toe with Heuristic Alpha-Beta Tree Search
- Example: Solving Tic-Tac-Toe with Pure Monte Carlo Search
- Example: Solving Tic-Tac-Toe with Pure Monte Carlo Search + UCB1 Selection Policy
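A minimal sketch of minimax search with alpha-beta pruning, again assuming a 9-cell tuple board with marks 'x' (maximizer, utility +1) and 'o' (minimizer, utility -1) and 0 for a draw; helper names are illustrative assumptions, not the repository's notebook code:

```python
import math

LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def utility(board):
    """+1 if 'x' has won, -1 if 'o' has won, 0 for a full board (draw), None otherwise."""
    for a, b, c in LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return 1 if board[a] == 'x' else -1
    return 0 if all(board) else None

def alphabeta(board, player, alpha=-math.inf, beta=math.inf):
    """Return (minimax value, best move) for the player to move; 'x' maximizes."""
    u = utility(board)
    if u is not None:
        return u, None
    best_move = None
    for m in (i for i in range(9) if board[i] is None):
        child = board[:m] + (player,) + board[m + 1:]
        value, _ = alphabeta(child, 'o' if player == 'x' else 'x', alpha, beta)
        if player == 'x' and value > alpha:
            alpha, best_move = value, m
        elif player == 'o' and value < beta:
            beta, best_move = value, m
        if alpha >= beta:
            break  # prune: the other player will never allow this branch
    return (alpha if player == 'x' else beta), best_move

print(alphabeta((None,) * 9, 'x'))  # value 0: perfect play ends in a draw
```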
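For the Monte Carlo variants, the UCB1 selection policy picks the child that maximizes the average playout reward plus an exploration bonus. A minimal sketch, with an assumed statistics layout and the usual exploration constant C = sqrt(2):

```python
import math

def ucb1(total_reward, visits, parent_visits, C=math.sqrt(2)):
    """UCB1 value of a child; unvisited children are always tried first."""
    if visits == 0:
        return math.inf
    return total_reward / visits + C * math.sqrt(math.log(parent_visits) / visits)

def select(children):
    """children: list of dicts with 'reward' and 'visits'; return the index to explore next."""
    parent_visits = sum(c['visits'] for c in children)
    return max(range(len(children)),
               key=lambda i: ucb1(children[i]['reward'], children[i]['visits'], parent_visits))
```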
Interactive game play
Assignments
- Assignment: Adversarial Search: Playing Connect 4
- Assignment: Adversarial Search: Playing “Mean” Connect 4
- Assignment: Adversarial Search: Playing Dots and Boxes
Connection to Machine Learning (more in Chapter 19)
- The example Learn to Score a Tic-Tac-Toe Board by Example uses a neural network trained on self-play data to learn an evaluation function that can be used as a heuristic in Heuristic Minimax Search or as a playout policy for Monte Carlo Search.
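A minimal sketch of this idea, assuming boards encoded as 9-element vectors (+1 for 'x', -1 for 'o', 0 for empty), self-play data generated by random play, and scikit-learn's MLPRegressor as the network; the data set size and network shape are arbitrary choices, not the repository's notebook code:

```python
import random
import numpy as np
from sklearn.neural_network import MLPRegressor

LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(board):
    """+1 if 'x' (encoded +1) wins, -1 if 'o' (encoded -1) wins, 0 otherwise."""
    for a, b, c in LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0

def random_playout():
    """Play one random game; label every visited position with the final outcome."""
    board, player, states = [0] * 9, 1, []
    while True:
        states.append(board.copy())
        w, empty = winner(board), [i for i in range(9) if board[i] == 0]
        if w != 0 or not empty:
            return [(s, w) for s in states]
        board[random.choice(empty)] = player
        player = -player

# Self-play data: positions labeled with the outcome of the game they occurred in.
data = [pair for _ in range(2000) for pair in random_playout()]
X = np.array([s for s, _ in data])
y = np.array([outcome for _, outcome in data])

# A small network learns to predict the expected outcome of a position for 'x'.
net = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=300).fit(X, y)

# The prediction can then serve as a heuristic value or guide a playout policy.
print(net.predict(np.array([[1, 0, 0, 0, -1, 0, 0, 0, 0]])))
```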
Connection to Reinforcement Learning (more in Chapter 22)
- The example Learning to Play Tic-Tac-Toe with Q-Learning implements a simple table-based Q-learning algorithm to play the game.
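A minimal sketch of the table-based update, assuming hashable board states, integer actions, and illustrative hyperparameters; not the repository's notebook code:

```python
import random
from collections import defaultdict

Q = defaultdict(float)               # Q[(state, action)] -> estimated action value
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def choose_action(state, actions):
    """Epsilon-greedy selection from the Q-table."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state, next_actions):
    """One-step Q-learning: Q <- Q + alpha * (reward + gamma * max_a' Q(s', a') - Q)."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```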
License
All code and documents in this repository are provided under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License.