Comparing Different Search Strategies: Maze¶
Notes on breaking ties:
- The order in which the children are explored (see
available directions) makes a big difference for DFS and dealing with empty spaces. I explore the directions in random order which makes the algorithm stochastic! - Ties for $h(n)$ and $f(n)$ need to be broken in a consistent manner. I use the most recently added node. To try to keep moving into the same direction.
%run maze_helper.py
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import numpy as np
#f = open("small_maze.txt", "r")
#f = open("medium_maze.txt", "r")
#f = open("large_maze.txt", "r") # this has only one solution!
#f = open("open_maze.txt", "r")
#f = open("empty_maze.txt", "r")
#f = open("empty_maze_2.txt", "r")
#f = open("loops_maze.txt", "r")
f = open("L_maze.txt", "r")
maze_str = f.read()
maze = parse_maze(maze_str)
# look at two positions in the maze
print("Position(0,0):", maze[0, 0])
print("Position(8,1):", maze[8, 1])
show_maze(maze)
Helper functions for the Maze Assignment by M. Hahsler Usage: import maze_helper as mh mh.show_some_mazes() Here is an example maze: XXXXXXXXXXXXXXXXXXXXXX X XX X X X X XXXXXX X XXXXXX X XXXXXX S X X X X XXXXXX XX XXXXX X XXXX X X X X XXX XXX X X XXXXXXXXXX XXXXXX X XG XX X XXXXXXXXXXXXXXXXXXXXXX The goal is at (np.int64(8), np.int64(1)). Position(0,0): X Position(8,1):
<Figure size 640x480 with 0 Axes>
Implementation¶
My implementation follows the pseudo code from the slides/textbook.
# tree_search_solution.py has my actual implementation (not published)
import tree_search_solution as ts
# order in which we add new states to the frontier
ts.set_order("NESW")
#ts.set_order(random=True)
Directions are checked in the order ['N', 'E', 'S', 'W']
Experiments¶
BFS¶
ts.set_order("NESW")
#ts.set_order(random=True)
%time result = ts.best_first_search(maze, strategy = "BFS", debug = False, vis = False)
ts.show_path(maze, result)
result['actions']
Directions are checked in the order ['N', 'E', 'S', 'W'] CPU times: user 2 ms, sys: 93 μs, total: 2.09 ms Wall time: 2.1 ms Path length: 16 Reached squares: 151
['E', 'E', 'E', 'E', 'S', 'E', 'E', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'E']
DFS¶
This implementation uses not reached data structure and has space complexity $O(bm)$ instead of $O(b^m)$ when we reuse the tree search algorithm from BFS!
We need to check for all cycles. If we do not break all cycles correctly, then we will end up in an infinite loop. Here are possible solutions:
- Stop after a fixed number of tries and return no solution
max_tries. - IDS solves this problem.
Note on the visualization: I use gray for areas that the algorithm has explored, but DFS has already removed it from memory!
# Note: DFS uses LIFO so the directions come from the stack in reverse order!
# Can get stuck for empty maze since cycle checking is not string enough!
# I use a maximum number of tries and stop if the goal is not reached.
ts.set_order("NESW")
#ts.set_order("SENW")
#ts.set_order("WSEN")
#ts.set_order(random=True)
%time result = ts.DFS(maze, vis = False, max_tries = 100000, debug_reached = True)
#result
ts.show_path(maze, result)
if result['path'] is None:
print("No solution found!")
Directions are checked in the order ['N', 'E', 'S', 'W'] CPU times: user 2.84 ms, sys: 0 ns, total: 2.84 ms Wall time: 2.89 ms Path length: 122 Reached squares: 145
Do the same, but change the order in which we explore states (add them to them as nodes to the frontier).
#ts.set_order("NESW")
ts.set_order("SENW")
#ts.set_order("WSEN")
#ts.set_order(random=True)
%time result = ts.DFS(maze, vis = False, max_tries = 100000, debug_reached = True)
#result
ts.show_path(maze, result)
if result['path'] is None:
print("No solution found!")
Directions are checked in the order ['S', 'E', 'N', 'W'] CPU times: user 448 μs, sys: 22 μs, total: 470 μs Wall time: 473 μs Path length: 26 Reached squares: 53
We could do a random walk and not check for cycles. This is guaranteed to reach eventually any square including the goal, but creates a long path. The path could be simplified leading to the Tremaux's algorithm.
ts.set_order(random = True)
# run it multiple times to see the effect of randomization
for _ in range(5):
%time result = ts.DFS(maze, check_cycle = False, max_tries = 100000, vis = False, debug_reached = True)
#result
ts.show_path(maze, result)
if result['path'] is None:
print("No solution found!")
Directions are checked at every step in random order. CPU times: user 6.43 ms, sys: 115 μs, total: 6.55 ms Wall time: 6.32 ms Path length: 144 Reached squares: 96
CPU times: user 22.4 ms, sys: 0 ns, total: 22.4 ms Wall time: 22.1 ms Path length: 1110 Reached squares: 148
CPU times: user 16.7 ms, sys: 18 μs, total: 16.8 ms Wall time: 16.6 ms Path length: 832 Reached squares: 126
CPU times: user 3.85 ms, sys: 0 ns, total: 3.85 ms Wall time: 3.85 ms Path length: 254 Reached squares: 95
CPU times: user 73.2 ms, sys: 0 ns, total: 73.2 ms Wall time: 72.9 ms Path length: 3502 Reached squares: 154
Run randomized DFS multiple times and use the best solution.¶
Note: IDS takes a similar amount of time and memory, but is guaranteed optimal.
ts.set_order(random = True)
N = 100
%time results = [ ts.DFS(maze, max_tries = 10000, vis = False) for _ in range(N) ]
# check if we found a solution and display the best solution
results = [ r for r in results if not r['path'] is None ]
if len(results) > 0:
path_lengths = [ len(r['path'])-1 for r in results ]
print(f"Solutions have path_lengths of {path_lengths}")
result = results[ts.min_index(path_lengths)]
ts.show_path(maze, result)
else:
print("No solution found!")
Directions are checked at every step in random order. CPU times: user 165 ms, sys: 7.63 ms, total: 173 ms Wall time: 171 ms Solutions have path_lengths of [78, 90, 76, 68, 74, 40, 82, 64, 62, 52, 64, 38, 82, 78, 70, 66, 56, 72, 72, 82, 42, 50, 78, 84, 72, 40, 78, 60, 70, 70, 82, 64, 94, 74, 30, 42, 82, 48, 36, 50, 68, 82, 84, 30, 68, 84, 58, 78, 74, 80, 56, 52, 78, 76, 64, 72, 54, 76, 64, 84, 60, 66, 60, 52, 32, 88, 52, 98, 78, 82, 58, 86, 88, 112, 94, 52, 42, 68, 92, 90, 56, 48, 52, 92, 40, 72, 86, 72, 54, 60, 62, 50, 76, 88, 72, 92, 48, 80, 78, 44] Path length: 30 Reached squares: 0
Depth limited DFS¶
Note: The frontier needs to be checked differently during cycle checking!
ts.set_order(random = True)
%time result = ts.DFS(maze, limit = 5, frontier_option = 2, max_tries = 100000, vis = False, debug_reached = True)
ts.show_path(maze, result)
Directions are checked at every step in random order. CPU times: user 1.72 ms, sys: 93 μs, total: 1.82 ms Wall time: 1.82 ms Reached squares: 57
IDS¶
Notes:
IDS with DFS does not store reached squares, so gray areas are not shown!
IDS depends on the cycle checking of DFS and therefore is also affected by these problems.
ts.set_order(random = True)
%time result = ts.IDS(maze, frontier_option = 2, max_tries = 100000)
ts.show_path(maze, result)
Directions are checked at every step in random order. CPU times: user 2.62 s, sys: 3.91 ms, total: 2.63 s Wall time: 2.63 s Path length: 16 Reached squares: 0
Greedy Best-First Search (GBFS)¶
# set the heuristic to Manhattan distance
ts.heuristic = ts.manhattan
ts.set_order(random=True)
%time result = ts.best_first_search(maze, strategy = "GBFS", debug = False, vis = False)
ts.show_path(maze, result)
Directions are checked at every step in random order. CPU times: user 1.77 ms, sys: 0 ns, total: 1.77 ms Wall time: 1.78 ms Path length: 18 Reached squares: 58
A* Search¶
ts.set_order(random=True)
%time result = ts.best_first_search(maze, strategy = "A*", debug = False, vis = False)
ts.show_path(maze, result)
Directions are checked at every step in random order. CPU times: user 1.88 ms, sys: 0 ns, total: 1.88 ms Wall time: 1.89 ms Path length: 16 Reached squares: 60
Weighted A* Search¶
$W > 1$ tends towards GBFS (optimality is not guaranteed)
ts.set_order(random=True)
%time result = ts.best_first_search(maze, strategy = "A*", W = 1+1e-9, debug = False, vis = False)
ts.show_path(maze, result)
Directions are checked at every step in random order. CPU times: user 2.82 ms, sys: 0 ns, total: 2.82 ms Wall time: 2.83 ms Path length: 16 Reached squares: 61
%time result = ts.best_first_search(maze, strategy = "A*", W = 5, debug = False, vis = False)
ts.show_path(maze, result)
CPU times: user 1.45 ms, sys: 69 μs, total: 1.52 ms Wall time: 1.52 ms Path length: 18 Reached squares: 59
%time result = ts.best_first_search(maze, strategy = "A*", W = 1000, debug = False, vis = False)
ts.show_path(maze, result)
CPU times: user 1.14 ms, sys: 54 μs, total: 1.2 ms Wall time: 1.2 ms Path length: 16 Reached squares: 60
$W<1$ tends towards Uniform-Cost Search/BFS (optimality is guaranteed)
%time result = ts.best_first_search(maze, strategy = "A*", W = .7, debug = False, vis = False)
ts.show_path(maze, result)
CPU times: user 3.63 ms, sys: 0 ns, total: 3.63 ms Wall time: 3.64 ms Path length: 16 Reached squares: 114
%time result = ts.best_first_search(maze, strategy = "A*", W = .0001, debug = False, vis = False)
ts.show_path(maze, result)
CPU times: user 6.1 ms, sys: 0 ns, total: 6.1 ms Wall time: 5.93 ms Path length: 16 Reached squares: 149
Compare Timing¶
#f = open("small_maze.txt", "r")
#f = open("medium_maze.txt", "r")
f = open("large_maze.txt", "r") # this has only one solution!
#f = open("open_maze.txt", "r")
#f = open("empty_maze.txt", "r")
#f = open("empty_maze_2.txt", "r")
#f = open("L_maze.txt", "r")
#f = open("loops_maze.txt", "r")
maze_str = f.read()
maze = parse_maze(maze_str)
ts.show_maze(maze)
ts.set_order(random=True)
Directions are checked at every step in random order.
import timeit
import math
reps = 50
times = {}
algorithms = ["BFS", "DFS", "GBFS", "A*"]
### FIXME: add IDS
for a in algorithms:
times[a] = math.ceil(timeit.timeit(stmt = f'ts.best_first_search(maze, strategy = "{a}", debug = False, vis = False)',
setup = 'from __main__ import ts, maze',
number = reps)*1e6/reps)
times['DFS(no reached)'] = math.ceil(timeit.timeit(stmt = f'ts.DFS(maze, vis = False)',
setup = 'from __main__ import ts, maze',
number = reps)*1e6/reps)
import pandas as pd
df = pd.DataFrame(times, index = ["time in micro seconds"])
df
| BFS | DFS | GBFS | A* | DFS(no reached) | |
|---|---|---|---|---|---|
| time in micro seconds | 6439 | 4092 | 9777 | 11065 | 9863 |
import matplotlib.pyplot as plt
plt.bar(df.columns, height = df.iloc[0])
plt.ylabel("run time in micro seconds")
plt.show()