Assignment 4 tips

Evaluation function (and ab-minimax) conventions (important clarifications!!!)
More subtle issues for evaluation functions
The alpha-beta minimax algorithm
The create-ab-minimax-player function

whuang@cs.rpi.edu; email
Last updates: October 12; October 11; October 7, 2000

Evaluation function (and ab-minimax) conventions

In order to write an evaluation function for Connect 4, you need to know whether the MAX player is playing X or O. For more sophisticated evaluation functions, you also need to know whose turn it is when you evaluate the state.

The assignment handout is vague regarding exactly what the player argument is. You might have made one of two assumptions: that the player argument is always the symbol for MAX, or that it is the symbol of the "current" player (consequently, you would negate the value returned to min-player).

To clarify this and to avoid any more confusion than is necessary, I am changing the number of arguments that your evaluation function should take. Your eval-fn should be of the form:

  (eval-fn board max-player player-up)

For example, if MAX is playing O, then the ab-max-player function would make the call (eval-fn board 'O 'O) and the ab-min-player function would make the call (eval-fn board 'O 'X).

Generally speaking, if the symbol is passed as an argument player, ab-max-player should make the call

  (eval-fn board player player)

and ab-min-player should make the call

  (eval-fn board (other-player player) player)

These calls occur only when the end of the game has ended or when the depth cutoff has been reached. The above convention might be a little confusing, so let me emphasize two principles and give an example:

The evaluation function should always return consistent values: large values are good for MAX, small values are good for MIN.
Your ab-minimax procedures should not alter the values returned by the evaluation function. (For example, ab-min-player should not negate the value returned by either the eval-fn or ab-max-player functions.)

To be concrete, if MAX is playing O, then a call from ab-max-player to the evaluation function will be (eval-fn board 'O 'O) and should return a high value if the board is good for player O and a low value if the board is bad for player O. A call from ab-min-player to the evaluation function will be (eval-fn board 'O 'X) and should return a low value if the board is good for player X and a high value if the board is bad for player X. "Bad for O" and "Good for X" (and vice versa) are assumed to be basically the same thing.

More subtle issues for evaluation functions

If you devise a method of evaluating a board for the current player, then you can just evaluate the board for max-player.

Here's a simple example:

(define (eval-fn board max-player player-up)
  (eval-fn-helper board max-player))

(define (eval-fn-helper board current-player)
  (let ((winner (c4-end? board))
        ; the number of open columns of three for current-player
        (player3c (second (open-columns board 
                                        current-player)))
        ; the number of open columns of three for opponent
        (opponent3c (second (open-columns board 
                                          (other-player current-player)))))
    (+ player3c 
       (- opponent3c)
       (cond ((equal? winner current-player) 100)
             ((equal? winner (other-player current-player)) -100)
             (else 0)))))

The above helper function doesn't care whose turn it is: (eval-fn-helper board 'X) is -1 times (eval-fn-helper board 'O). For more advanced evaluation functions, though, you'd want to take this into account. One simple way of doing this is to use the following sort of function to call the helper function:

(define (eval-fn board max-player player-up)
  (let ((val (eval-fn-helper board player-up)))
    (if (equal? max-player player-up)
        val                             ; return the value to ab-max-player
        (- val))))                      ; negate the value for ab-min-player

With this approach, the helper function evaluates the board for the current player and can assume that that player gets to move next. The result is negated in the main function if ab-min-player made the call.

The alpha-beta minimax algorithm

Here's the alpha-beta minimax algorithm from our text with the changes we made in class in bold. (The > signs on the second line of the "for" loops are supposed to be in bold, but they don't come out that way on my browser.)

MAX-PLAYER(state, alpha, beta)

if cutoff-test(state) then return eval-fn(state)
for each child C of state do
- alpha = max(alpha, MIN-PLAYER(C, alpha, beta)
- if alpha > beta then return alpha
return alpha

MIN-PLAYER(state, alpha, beta)

if cutoff-test(state) then return eval-fn(state)
for each child C of state do
- beta = min(beta, MAX-PLAYER(C, alpha, beta)
- if alpha > beta then return beta
return beta

The reason we made these changes was so, for example, max-player can tell from the values of its children (as evaluated by min-player) which one(s) correspond to the best move.

The above procedures return the value of the game tree, but ultimately your minimax-player must return a move (an integer between 1 and 7 inclusive), and this is the basic challenge in Problem 4b.

If there multiple optimal moves, you can pick any one. However, I do not recommend keeping track of all the optimal moves. It's not too difficult, but the "bookkeeping" involved will make your functions more complicated than they need to be.

I recommend simply returning the first best move you find. If other equally good moves present themselves, just ignore them. If you are sure that you are doing this properly, you can replace the > in the 2a lines with >= and save a few node evaluations. If you are not taking the first best move you find, then you must

The `create-ab-minimax-player` function

Your create-ab-minimax-player function will serve only as a front end. It needs to take a board and player and return a move for that player. (Preferrably a good move!)

The reason for structuring it this way is so that you can use any node representation that you want. It also allows me to use a different evaluation function and to control the depth cutoff during testing.

Here's the create-ab-minimax-player function that I use. You can just use this function or create your own. Note that my nodes consist only of the board state.

(define (create-ab-minimax-player eval-fn depth-cutoff)
  (define (minimax-player board player)
    (ab-minimax board
                player
                eval-fn
                depth-cutoff))
  minimax-player)

Assignment 4 tips

Evaluation function (and ab-minimax) conventions

More subtle issues for evaluation functions

The alpha-beta minimax algorithm

The create-ab-minimax-player function

The `create-ab-minimax-player` function