Adversarial Search
- Dheerendra
Outline
 Introduction
 Games as search problems
 Optimal decisions: the Minimax algorithm
 α-β pruning
 Imperfect, real-time decisions
Adversarial search
 Examines the problems that arise when we try to plan ahead in a world
where other agents are planning against us.
 A good example is board games.
 Adversarial games, while much studied in AI, are a small part of game
theory in economics.
Typical AI assumptions
 Two agents whose actions alternate.
 Utility values for the agents are opposite to each other.
 This creates the adversarial situation.
 Fully observable environment.
 In game-theory terms: zero-sum games of perfect information.
Search versus Games
 Search – no adversary
 Solution is a (heuristic) method for finding a goal.
 Heuristic techniques can find the optimal solution.
 Evaluation function: estimate of cost from start to goal through a given node.
 Examples: path finding, scheduling activities.
 Game – adversary
 Solution is a strategy (a strategy specifies a move for every possible opponent reply).
 Optimality depends on the opponent.
 Time limits force an approximate solution.
 Evaluation function: evaluates the “goodness” of a game position.
 Examples: Chess, Checkers, Go.
Game setup
 Two players: MAX and MIN.
 MAX moves first, and they take turns until the game is over.
 The winner is awarded; the loser is penalized.
 Game as search:
 Initial state: e.g. the board configuration in chess.
 Successor function: list of (move, state) pairs specifying legal moves.
 Terminal test: is the game finished?
 Utility function: gives a numerical value for each terminal state, e.g. win (+1), lose (-1),
or draw (0) in tic-tac-toe or chess.
 MAX uses a search tree to determine the next move.
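The formalization above can be made concrete with a small tic-tac-toe sketch. The representation choices here (a tuple of nine cells, 'X' as MAX who moves first) are illustrative assumptions, not prescribed by the slides:

```python
# Winning lines of a 3x3 board stored row-major as a 9-tuple.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def initial_state():
    return (None,) * 9                      # empty board

def player_to_move(state):
    xs = sum(c == 'X' for c in state)
    os = sum(c == 'O' for c in state)
    return 'X' if xs == os else 'O'         # 'X' (MAX) moves first

def successors(state):
    """All legal (move, state) pairs from this state."""
    p = player_to_move(state)
    return [(i, state[:i] + (p,) + state[i + 1:])
            for i, c in enumerate(state) if c is None]

def winner(state):
    for a, b, c in LINES:
        if state[a] is not None and state[a] == state[b] == state[c]:
            return state[a]
    return None

def terminal_test(state):
    return winner(state) is not None or all(c is not None for c in state)

def utility(state):
    """Terminal utility from MAX's point of view: win +1, lose -1, draw 0."""
    return {'X': +1, 'O': -1, None: 0}[winner(state)]
```

From the empty board there are nine legal moves; the utility function only applies once the terminal test succeeds.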
Partial game tree for tic-tac-toe
Minimax strategy: look ahead and reason
backward
 Find the optimal strategy for MAX assuming an infallible MIN opponent.
 Need to compute this all the way down the tree.
 Complete depth-first exploration of the game tree.
 Assumption: both players play optimally.
 Given a game tree, the optimal strategy can be determined from the
minimax value of each node.
Two-ply game tree
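A two-ply computation can be sketched directly over an explicit tree. Internal nodes are Python lists of children and leaves are utilities for MAX; the example values are a typical textbook two-ply tree, chosen here for illustration:

```python
def minimax(node, maximizing=True):
    """Minimax value of a node; MAX and MIN alternate levels."""
    if not isinstance(node, list):          # leaf: terminal utility for MAX
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Two-ply example: MAX at the root chooses among three MIN nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))  # MIN values are 3, 2, 2, so MAX achieves 3
```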
Practical problem with minimax search
 The number of game states is exponential in the number of moves.
 Solution: do not examine every node
=> pruning
Remove branches that do not influence the final decision.
Alpha-beta algorithm
 Depth-first search only considers nodes along a single path at any time.
α = highest-value choice that we can guarantee for MAX so far in the current
subtree.
β = lowest-value choice that we can guarantee for MIN so far in the current
subtree.
 Update the values of α and β during the search and prune the remaining
branches as soon as a value is known to be worse than the current α or β
value for MAX or MIN.
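The update-and-prune rule can be sketched on the same explicit-tree representation (lists are internal nodes, numbers are leaf utilities for MAX; the example tree is illustrative):

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Minimax value with alpha-beta pruning."""
    if not isinstance(node, list):
        return node                          # leaf utility
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:                # MIN will never allow this subtree
                break                        # prune the remaining children
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:                # MAX will never enter this subtree
                break
        return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))  # same answer as plain minimax: 3
```

On this tree the second MIN node is cut off after its first leaf (2 ≤ α = 3), exactly the kind of branch that cannot influence the final decision.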
Alpha-Beta example
 Do depth-first search until the first leaf.
Effectiveness of Alpha-beta search
 Worst case
 Branches are ordered so that no pruning takes place. In this case alpha-beta
gives no improvement over exhaustive search.
 Best case
 Each player’s best move is the left-most alternative.
 In practice, performance is closer to the best case than the worst case.
 In practice, we often get O(b^(d/2)) rather than O(b^d).
 This is the same as having a branching factor of sqrt(b),
 since (sqrt(b))^d = b^(d/2),
 i.e., we have effectively gone from b to the square root of b.
 E.g., in chess, go from b ≈ 35 to b ≈ 6.
 This permits much deeper search in the same amount of time.
 Typically twice as deep.
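The savings can be probed with a small, purely illustrative experiment (an assumption-laden sketch, not a benchmark): count leaf evaluations of alpha-beta on a random uniform tree. With random move ordering, pruning falls between the worst case (b^d) and the best case (about b^(d/2)):

```python
import math
import random

def make_tree(depth, branching, rng):
    """Random game tree: leaves get random utilities in [0, 1)."""
    if depth == 0:
        return rng.random()
    return [make_tree(depth - 1, branching, rng) for _ in range(branching)]

def alphabeta_counting(node, alpha, beta, maximizing, counter):
    """Alpha-beta that tallies leaf evaluations in counter[0]."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        v = alphabeta_counting(child, alpha, beta, not maximizing, counter)
        if maximizing:
            value = max(value, v)
            alpha = max(alpha, value)
        else:
            value = min(value, v)
            beta = min(beta, value)
        if alpha >= beta:
            break
    return value

b, d = 5, 6
tree = make_tree(d, b, random.Random(0))
leaves = [0]
alphabeta_counting(tree, -math.inf, math.inf, True, leaves)
print(f"evaluated {leaves[0]} of {b ** d} leaves")
```

Sorting children so that the best move comes first (e.g. by a cheap evaluation) pushes the count toward the b^(d/2) best case.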
Final comments about Alpha-beta Pruning
 Pruning does not affect the final result.
 Entire subtrees can be pruned.
 Good move ordering improves the effectiveness of pruning.
 Repeated states are again possible.
 Store them in memory: a transposition table.
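A transposition table is just a cache keyed on the state (and the player to move). The demo game below, a subtraction game where players take 1-3 stones and taking the last stone wins, is an illustrative assumption; its states repeat heavily, so the table pays off:

```python
def minimax_tt(state, maximizing, table):
    """Minimax with a transposition table; value is from MAX's viewpoint."""
    key = (state, maximizing)
    if key in table:
        return table[key]                    # repeated state: reuse stored value
    if state == 0:                           # the player to move cannot move and lost
        value = -1 if maximizing else +1
    else:
        children = [state - k for k in (1, 2, 3) if k <= state]
        vals = [minimax_tt(c, not maximizing, table) for c in children]
        value = max(vals) if maximizing else min(vals)
    table[key] = value
    return value

table = {}
print(minimax_tt(21, True, table))  # 1: 21 stones is a win for the player to move
```

Without the table the recursion revisits the same piles exponentially often; with it, each (state, player) pair is evaluated once.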
Practical implementation
How do we make these ideas practical in real game trees?
Standard approach:
 Cut-off test (where do we stop descending the tree?)
 Depth limit
 Better: iterative deepening
 Cut off only when no big changes are expected to occur next (quiescence search)
 Evaluation function
 When the search is cut off, we evaluate the current state by estimating its utility using
an evaluation function.
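The standard approach can be sketched as alpha-beta with a depth cut-off that falls back on an evaluation function. The game passed in via callbacks, and the toy subtraction game used for the demo (take 1-3 stones, taking the last stone wins, with a heuristic that knows multiples of 4 are losing for the mover), are illustrative assumptions:

```python
import math

def h_alphabeta(state, depth, alpha, beta, maximizing, game):
    """Depth-limited alpha-beta; `game` supplies the four callbacks."""
    if game["is_terminal"](state):
        return game["utility"](state, maximizing)
    if depth == 0:                           # cut-off test: depth limit reached
        return game["evaluate"](state, maximizing)   # estimate, don't recurse
    if maximizing:
        value = -math.inf
        for s in game["successors"](state):
            value = max(value, h_alphabeta(s, depth - 1, alpha, beta, False, game))
            alpha = max(alpha, value)
            if alpha >= beta:
                break
        return value
    value = math.inf
    for s in game["successors"](state):
        value = min(value, h_alphabeta(s, depth - 1, alpha, beta, True, game))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

nim = {
    "successors": lambda n: [n - k for k in (1, 2, 3) if k <= n],
    "is_terminal": lambda n: n == 0,
    # At 0 stones the player to move has lost (value from MAX's viewpoint).
    "utility": lambda n, maximizing: -1 if maximizing else +1,
    # Heuristic: multiples of 4 are losing for the player to move.
    "evaluate": lambda n, maximizing:
        (-1 if n % 4 == 0 else +1) * (1 if maximizing else -1),
}
print(h_alphabeta(10, 3, -math.inf, math.inf, True, nim))  # 1: 10 is a win for MAX
```

With a good evaluation function, a shallow cut-off search still finds the winning line, which is the whole point of the standard approach.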
Static (Heuristic) Evaluation Function
 An evaluation function:
 Estimates how good the current board configuration is for a player.
 Typically, one figures out how good the position is for the player and how good it is for
the opponent, and subtracts the opponent’s score from the player’s.
 Chess: value of white pieces – value of black pieces.
 Typical values range from –infinity (loss) to +infinity (win), or [-1, +1].
 If the board evaluation is X for a player, it is –X for the opponent.
 Many clever ideas exist for using the evaluation function.
 E.g. the null-move heuristic: let the opponent move twice.
 Examples
 Chess
 Checkers
 Tic-tac-toe
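The chess material heuristic from the slide (white piece values minus black piece values) can be sketched directly. The board encoding, a dict from square to piece letter with uppercase for White, and the standard 1/3/3/5/9 piece values are illustrative assumptions:

```python
# Conventional material values; the king gets 0 since it is never captured.
PIECE_VALUE = {'p': 1, 'n': 3, 'b': 3, 'r': 5, 'q': 9, 'k': 0}

def material_eval(board):
    """Value of white pieces minus value of black pieces (White = MAX)."""
    score = 0
    for piece in board.values():
        value = PIECE_VALUE[piece.lower()]
        score += value if piece.isupper() else -value
    return score

# White has an extra rook, so the evaluation is +5 from White's viewpoint.
board = {'e1': 'K', 'e8': 'k', 'a1': 'R'}
print(material_eval(board))  # 5
```

Note the symmetry property from the slide holds by construction: negating every piece's color negates the score.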
Iterative (Progressive) Deepening
 In real games, there is usually a time limit T on making a move.
 How do we take this into account?
 Using alpha-beta, we cannot use a “partial” result with any confidence unless the full
breadth of the tree has been searched.
 So we could set a conservative depth limit that guarantees we will find a move in
time < T.
 The disadvantage is that we may finish early and could have done more search.
 In practice, iterative deepening search (IDS) is used.
 IDS runs depth-first search with an increasing depth limit.
 When the clock runs out, we use the solution found at the previous depth limit.
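The IDS loop can be sketched as follows; `depth_limited_search` is a placeholder for minimax or alpha-beta with a cut-off at the given depth, and the stub used in the demo is an assumption for illustration:

```python
import time

def iterative_deepening(state, depth_limited_search, time_limit, max_depth=64):
    """Return the result of the deepest fully completed search within time_limit."""
    deadline = time.monotonic() + time_limit
    best = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break                            # clock ran out: keep previous result
        best = depth_limited_search(state, depth)
    return best

# Demo with a stub searcher that just reports the depth it completed.
result = iterative_deepening("start", lambda s, d: ("move", d), 0.01, max_depth=5)
print(result)
```

A production engine would also abort a search that overruns the deadline mid-iteration; this sketch only checks the clock between iterations, matching the slide's "use the solution from the previous depth limit" rule.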
The state of play
 Checkers
 Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994.
 Chess
 Deep Blue defeated human world champion Garry Kasparov in a six-game match in
1997.
 Go
 AlphaGo, developed by Alphabet Inc.’s Google DeepMind, beat Ke Jie, the world
No. 1 ranked Go player at the time, in 2017.
 See (e.g.) http://www.cs.ualberta.ac/~games for more information.
Conclusion
 Game playing can be effectively modelled as a search problem.
 A game tree represents alternating computer/opponent moves.
 Evaluation functions estimate the quality of a given board configuration for
the MAX player.
 Minimax is an approach that chooses a move by assuming the opponent will
always choose the move that is best for them.
 Alpha-beta is a procedure that can prune large parts of the search tree,
allowing the search to go deeper.
 For many well-known games, computer algorithms based on heuristic
search match or outperform human world experts.