-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathothello_paip.py
488 lines (384 loc) · 17.1 KB
/
othello_paip.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
"""
**Othello** is a turn-based two-player strategy board game. The players take
turns placing pieces--one player white and the other player black--on an 8x8
board in such a way that captures some of the opponent's pieces, with the goal
of finishing the game with more pieces of their color on the board.
Every move must capture one more more of the opponent's pieces. To capture,
player A places a piece adjacent to one of player B's pieces so that there is a
straight line (horizontal, vertical, or diagonal) of adjacent pieces that begins
with one of player A's pieces, continues with one more more of player B's
pieces, and ends with one of player A's pieces.
For example, if Black places a piece on square (5, 1), he will capture all of
Black's pieces between (5, 1) and (5, 6):
1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
1 . . . . . . . . 1 . . . . . . . .
2 . . . . . . . . 2 . . . . . . . .
3 . . o @ . o . . 3 . . o @ . o . .
4 . . o o @ @ . . 4 . . o o @ @ . .
5 . o o o o @ . . 5 @ @ @ @ @ @ . .
6 . . . @ o . . . 6 . . . @ o . . .
7 . . . . . . . . 7 . . . . . . . .
8 . . . . . . . . 8 . . . . . . . .
For more more information about the game (which is also known as Reversi)
including detailed rules, see the entry on [Wikipedia][wiki]. Additionally,
this implementation doesn't take into account some tournament-style Othello
details, such as game time limits and a different indexing scheme.
We will implement representations for the board and pieces and the mechanics of
playing a game. We will then explore several game-playing strategies. There is
a simple command-line program [provided](examples/othello/othello.html) for
playing against the computer or comparing two strategies.
Written by [Daniel Connelly](http://dhconnelly.com). This implementation follows
chapter 18 of Peter Norvig's "Paradigms of Artificial Intelligence".
[wiki]: http://en.wikipedia.org/wiki/Reversi
"""
# -----------------------------------------------------------------------------
## Table of contents
# 1. [Board representation](#board)
# 2. [Playing the game](#playing)
# 3. [Strategies](#strategies)
# - [Random](#random)<br>
# - [Local maximization](#localmax)<br>
# - [Minimax search](#minimax)<br>
# - [Alpha-beta search](#alphabeta)<br>
# 4. [Conclusion](#conclusion)
# -----------------------------------------------------------------------------
# <a id="board"></a>
## Board representation
# We represent the board as a 100-element list, which includes each square on
# the board as well as the outside edge. Each consecutive sublist of ten
# elements represents a single row, and each list element stores a piece. An
# initial board contains four pieces in the center:
# ? ? ? ? ? ? ? ? ? ?
# ? . . . . . . . . ?
# ? . . . . . . . . ?
# ? . . . . . . . . ?
# ? . . . o @ . . . ?
# ? . . . @ o . . . ?
# ? . . . . . . . . ?
# ? . . . . . . . . ?
# ? . . . . . . . . ?
# ? ? ? ? ? ? ? ? ? ?
# This representation has two useful properties:
#
# 1. Square (m,n) can be accessed as `board[mn]`. This avoids the need to write
# functions that convert between square locations and list indexes.
# 2. Operations involving bounds checking are slightly simpler.
# The outside edge is marked ?, empty squares are ., black is @, and white is o.
# The black and white pieces represent the two players.
EMPTY, BLACK, WHITE, OUTER = '.', '@', 'o', '?'
PIECES = (EMPTY, BLACK, WHITE, OUTER)
PLAYERS = {BLACK: 'Black', WHITE: 'White'}
# To refer to neighbor squares we can add a direction to a square.
UP, DOWN, LEFT, RIGHT = -10, 10, -1, 1
UP_RIGHT, DOWN_RIGHT, DOWN_LEFT, UP_LEFT = -9, 11, 9, -11
DIRECTIONS = (UP, UP_RIGHT, RIGHT, DOWN_RIGHT, DOWN, DOWN_LEFT, LEFT, UP_LEFT)
def squares():
"""List all the valid squares on the board."""
return [i for i in range(11, 89) if 1 <= (i % 10) <= 8]
def initial_board():
"""Create a new board with the initial black and white positions filled."""
board = [OUTER] * 100
for i in squares():
board[i] = EMPTY
# The middle four squares should hold the initial piece positions.
board[44], board[45] = WHITE, BLACK
board[54], board[55] = BLACK, WHITE
return board
def print_board(board):
"""Get a string representation of the board."""
rep = ''
rep += ' %s\n' % ' '.join(map(str, list(range(1, 9))))
for row in range(1, 9):
begin, end = 10 * row + 1, 10 * row + 9
rep += '%d %s\n' % (row, ' '.join(board[begin:end]))
return rep
# -----------------------------------------------------------------------------
# <a id="playing"></a>
## Playing the game
# We need functions to get moves from players, check to make sure that the moves
# are legal, apply the moves to the board, and detect when the game is over.
### Checking moves
# A move must be both valid and legal: it must refer to a real square, and it
# must form a bracket with another piece of the same color with pieces of the
# opposite color in between.
def is_valid(move):
"""Is move a square on the board?"""
return isinstance(move, int) and move in squares()
def opponent(player):
"""Get player's opponent piece."""
return BLACK if player is WHITE else WHITE
def find_bracket(square, player, board, direction):
"""
Find a square that forms a bracket with `square` for `player` in the given
`direction`. Returns None if no such square exists.
"""
bracket = square + direction
if board[bracket] == player:
return None
opp = opponent(player)
while board[bracket] == opp:
bracket += direction
return None if board[bracket] in (OUTER, EMPTY) else bracket
def is_legal(move, player, board):
"""Is this a legal move for the player?"""
hasbracket = lambda direction: find_bracket(move, player, board, direction)
return board[move] == EMPTY and any(map(hasbracket, DIRECTIONS))
### Making moves
# When the player makes a move, we need to update the board and flip all the
# bracketed pieces.
def make_move(move, player, board):
"""Update the board to reflect the move by the specified player."""
board[move] = player
for d in DIRECTIONS:
make_flips(move, player, board, d)
return board
def make_flips(move, player, board, direction):
"""Flip pieces in the given direction as a result of the move by player."""
bracket = find_bracket(move, player, board, direction)
if not bracket:
return
square = move + direction
while square != bracket:
board[square] = player
square += direction
### Monitoring players
class IllegalMoveError(Exception):
def __init__(self, player, move, board):
self.player = player
self.move = move
self.board = board
def __str__(self):
return '%s cannot move to square %d' % (PLAYERS[self.player], self.move)
def legal_moves(player, board):
"""Get a list of all legal moves for player."""
return [sq for sq in squares() if is_legal(sq, player, board)]
def any_legal_move(player, board):
"""Can player make any moves?"""
return any(is_legal(sq, player, board) for sq in squares())
### Putting it all together
# Each round consists of:
#
# - Get a move from the current player.
# - Apply it to the board.
# - Switch players. If the game is over, get the final score.
def play(black_strategy, white_strategy):
"""Play a game of Othello and return the final board and score."""
board = initial_board()
player = BLACK
strategy = lambda who: black_strategy if who == BLACK else white_strategy
while player is not None:
move = get_move(strategy(player), player, board)
make_move(move, player, board)
#print(print_board(board))
player = next_player(board, player)
return board, score(BLACK, board)
def next_player(board, prev_player):
"""Which player should move next? Returns None if no legal moves exist."""
opp = opponent(prev_player)
if any_legal_move(opp, board):
return opp
elif any_legal_move(prev_player, board):
return prev_player
return None
def get_move(strategy, player, board):
"""Call strategy(player, board) to get a move."""
copy = list(board) # copy the board to prevent cheating
move = strategy(player, copy)
if not is_valid(move) or not is_legal(move, player, board):
raise IllegalMoveError(player, move, copy)
return move
def score(player, board):
"""Compute player's score (number of player's pieces minus opponent's)."""
mine, theirs = 0, 0
opp = opponent(player)
for sq in squares():
piece = board[sq]
if piece == player:
mine += 1
elif piece == opp:
theirs += 1
return mine - theirs
# -----------------------------------------------------------------------------
# <a id="strategies"></a>
## Play strategies
# <a id="random"></a>
### Random
# The easiest strategy to implement simply picks a move at random.
import random
def random_strategy(player, board):
"""A strategy that always chooses a random legal move."""
return random.choice(legal_moves(player, board))
# <a id="localmax"></a>
### Local maximization
# A more sophisticated strategy could look at every available move and evaluate
# them in some way. This consists of getting a list of legal moves, applying
# each one to a copy of the board, and choosing the move that results in the
# "best" board.
def maximizer(evaluate):
"""
Construct a strategy that chooses the best move by maximizing
evaluate(player, board) over all boards resulting from legal moves.
"""
def strategy(player, board):
def score_move(move):
return evaluate(player, make_move(move, player, list(board)))
return max(legal_moves(player, board), key=score_move)
return strategy
# One possible evaluation function is `score`. A strategy constructed with
# `maximizer(score)` will always make the move that results in the largest
# immediate gain in pieces.
# A more advanced evaluation function might consider the relative worth of each
# square on the board and weight the score by the value of the pieces held by
# each player. Since corners and (most) edge squares are very valuable, we
# could weight those more heavily, and add negative weights to the squares that,
# if acquired, could lead to the opponent capturing the corners or edges.
SQUARE_WEIGHTS = [
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 120, -20, 20, 5, 5, 20, -20, 120, 0,
0, -20, -40, -5, -5, -5, -5, -40, -20, 0,
0, 20, -5, 15, 3, 3, 15, -5, 20, 0,
0, 5, -5, 3, 3, 3, 3, -5, 5, 0,
0, 5, -5, 3, 3, 3, 3, -5, 5, 0,
0, 20, -5, 15, 3, 3, 15, -5, 20, 0,
0, -20, -40, -5, -5, -5, -5, -40, -20, 0,
0, 120, -20, 20, 5, 5, 20, -20, 120, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
]
# A strategy constructed as `maximizer(weighted_score)`, then, will always
# return the move that results in the largest immediate *weighted* gain in
# pieces.
def weighted_score(player, board):
"""
Compute the difference between the sum of the weights of player's
squares and the sum of the weights of opponent's squares.
"""
opp = opponent(player)
total = 0
for sq in squares():
if board[sq] == player:
total += SQUARE_WEIGHTS[sq]
elif board[sq] == opp:
total -= SQUARE_WEIGHTS[sq]
return total
# <a id="minimax"></a>
### Minimax search
# The maximizer strategies are very short-sighted, and a player who can consider
# the implications of a move several turns in advance could have a significant
# advantage. The **minimax** algorithm does just that.
def minimax(player, board, depth, evaluate):
"""
Find the best legal move for player, searching to the specified depth.
Returns a tuple (move, min_score), where min_score is the guaranteed minimum
score achievable for player if the move is made.
"""
# We define the value of a board to be the opposite of its value to our
# opponent, computed by recursively applying `minimax` for our opponent.
def value(board):
return -minimax(opponent(player), board, depth - 1, evaluate)[0]
# When depth is zero, don't examine possible moves--just determine the value
# of this board to the player.
if depth == 0:
return evaluate(player, board), None
# We want to evaluate all the legal moves by considering their implications
# `depth` turns in advance. First, find all the legal moves.
moves = legal_moves(player, board)
# If player has no legal moves, then either:
if not moves:
# the game is over, so the best achievable score is victory or defeat;
if not any_legal_move(opponent(player), board):
return final_value(player, board), None
# or we have to pass this turn, so just find the value of this board.
return value(board), None
# When there are multiple legal moves available, choose the best one by
# maximizing the value of the resulting boards.
return max((value(make_move(m, player, list(board))), m) for m in moves)
# Values for endgame boards are big constants.
MAX_VALUE = sum(map(abs, SQUARE_WEIGHTS))
MIN_VALUE = -MAX_VALUE
def final_value(player, board):
"""The game is over--find the value of this board to player."""
diff = score(player, board)
if diff < 0:
return MIN_VALUE
elif diff > 0:
return MAX_VALUE
return diff
def minimax_searcher(depth, evaluate):
"""
Construct a strategy that uses `minimax` with the specified leaf board
evaluation function.
"""
def strategy(player, board):
return minimax(player, board, depth, evaluate)[1]
return strategy
# <a id="alphabeta"></a>
### Alpha-Beta search
# Minimax is very effective, but it does too much work: it evaluates many search
# trees that should be ignored.
# Consider what happens when minimax is evaluating two moves, M1 and M2, on one
# level of a search tree. Suppose minimax determines that M1 can result in a
# score of S. While evaluating M2, if minimax finds a move in its subtree that
# could result in a better score than S, the algorithm should immediately quit
# evaluating M2: the opponent will force us to play M1 to avoid the higher score
# resulting from M1, so we shouldn't waste time determining just how much better
# M2 is than M1.
# We need to keep track of two values:
#
# - alpha: the maximum score achievable by any of the moves we have encountered.
# - beta: the score that the opponent can keep us under by playing other moves.
#
# When the algorithm begins, alpha is the smallest value and beta is the largest
# value. During evaluation, if we find a move that causes `alpha >= beta`, then
# we can quit searching this subtree since the opponent can prevent us from
# playing it.
def alphabeta(player, board, alpha, beta, depth, evaluate):
"""
Find the best legal move for player, searching to the specified depth. Like
minimax, but uses the bounds alpha and beta to prune branches.
"""
if depth == 0:
return evaluate(player, board), None
def value(board, alpha, beta):
# Like in `minimax`, the value of a board is the opposite of its value
# to the opponent. We pass in `-beta` and `-alpha` as the alpha and
# beta values, respectively, for the opponent, since `alpha` represents
# the best score we know we can achieve and is therefore the worst score
# achievable by the opponent. Similarly, `beta` is the worst score that
# our opponent can hold us to, so it is the best score that they can
# achieve.
return -alphabeta(opponent(player), board, -beta, -alpha, depth - 1, evaluate)[0]
moves = legal_moves(player, board)
if not moves:
if not any_legal_move(opponent(player), board):
return final_value(player, board), None
return value(board, alpha, beta), None
best_move = moves[0]
for move in moves:
if alpha >= beta:
# If one of the legal moves leads to a better score than beta, then
# the opponent will avoid this branch, so we can quit looking.
break
val = value(make_move(move, player, list(board)), alpha, beta)
if val > alpha:
# If one of the moves leads to a better score than the current best
# achievable score, then replace it with this one.
alpha = val
best_move = move
return alpha, best_move
def alphabeta_searcher(depth, evaluate):
def strategy(player, board):
return alphabeta(player, board, MIN_VALUE, MAX_VALUE, depth, evaluate)[1]
return strategy
# -----------------------------------------------------------------------------
# <a id="conclusion"></a>
## Conclusion
# The strategies we've discussed are very general and are applicable to a broad
# range of strategy games, such as Chess, Checkers, and Go. More advanced
# strategies for Othello exist that apply various gameplay heuristics; some of
# these are discussed in "Paradigms of Artificial Intelligence Programming" by
# Peter Norvig.
#
# See Wikipedia for more details on [minimax][mm] and [alpha-beta][ab] search.
#
# [mm]: http://en.wikipedia.org/wiki/Minimax
# [ab]: http://en.wikipedia.org/wiki/Alpha-beta_pruning