TicTacToeAI
Tic-Tac-Toe bot that will never lose, implemented with AI design MiniMax.
Overview
- Built a Tic-Tac-Toe game for iPhone 4s.
- Made the bot with different implementations:
- Average score: the bot goes for the highest score available.
- MiniMax score: the bot minimizes the player's score.
- Alpha-Beta pruning: MiniMax with cut-offs to load the game faster.
- (To-Do) Rotations and reflections: Alpha-Beta pruning with even more cut-offs by using symmetry.
Creating the Bot
Photo taken from http://www.devx.com/dotnet/Article/34912
The first implementation of the bot was done by making all the possible moves from the start to the finish and adding how the board looks like to a tree.
Then, I added a variable to count just exactly how many boards are created, which turned out to be 986,410.
I checked it by deriving a formula from the pattern 1 + 9 + (9) * 8 + (9 * 8) * 7 + ... + 9!
which is 1 + Summation(9!/K!) where K = 0 to K = 8
, getting us 986,410 boards!
So, the initial loading time was pretty long because there were waaaay too many boards, but we were happy to see that at least we have a working tree.
The tree looks exactly like the picture above.
Not all the possible moves are necessary because the game can end earlier if there is a winner. So after adding a check to stop making boards when a winner is found, the amount of boards reduced to 549,945 and so did our initial loading time.
All the methods so far has been loading the entire tree at the initial load. So with our method, it has a very long initial load but after the game is loaded, the bot moves instantly.
In order to make the initial load not so displeasing, Alpha-Beta pruning technique was implemented with the MiniMax algorithm. Instead of checking for a winner like before, it checks for the scores that are given from the children tree and decides whether or not it needs to check other trees. This reduced the amount of boards at initial load to 85,097! With this method, it was a faster initial load, with the downside of having to load every bot's turn, which wasn't bad because the number of boards needed to make for future moves are a lot smaller.
This is an overestimated number of boards made with Alpha-Beta pruning at every round.
Round | # of boards |
---|---|
1 | 85,097 |
2 | ~22,000 |
3 | ~3,500 |
4 | ~1,000 |
5 | ~200 |
6 | ~70 |
7 | ~20 |
8 | ~5 |
9 | <= 1 |
Total | ~111,000 |
Here's a rough comparison between each implementation with the iPhone 4s simulator on my MacBook.
Algorithm | # of boards | Initial loading speed (seconds) | 1st round loading speed (seconds) | 2nd round loading speed (seconds) |
---|---|---|---|---|
All possible moves | 986,410 | 15 s | 0 s | 0 s |
All possible moves w/ checks for winner | 549,945 | 9 s | 0 s | 0 s |
Alpha-Beta pruning | 111,000 | 0 s | 1.5 s | 0.4 s |
Scoring the Bot
Photo taken from http://users.sussex.ac.uk/~christ/crs/kr-ist/lec05a.html
We have a tree where each depth represents a move made on the board. The scores are calculated at the leaf nodes when the game ends, returning a score of 1 if the bot wins, 0 for a tie, and -1 for a loss. This is known as the static score. Then the scores are passed to the parents until it reached the root node.
Average Score
Picture credit: Alaric
The first idea was to select the highest average scores from the children nodes to get the best possible chance of winning. The idea was that the children with the higher average score will lead to higher chance of winning. While this is true, it didn't create an unbeatable Tic-Tac-Toe. It made the bot too focused on selecting the moves that leads it to higher chances of victory. What this means is that, even if the opponent is about to win in the next turn, the bot will continue to go towards a move where it CAN lead to higher chances of winning instead of blocking. Because the bot doesn't block the opponent's moves, the bot often loses and was not unbeatable as we wanted it to be.
MiniMax Score
MiniMax is the idea of minimizing the opponent's maximum score. It goes like this:
- Build the game tree. The static scores are determined at the leaf nodes.
- Start traversing back, passing scores to each parent nodes.
- The bot node will pick the highest score from children.
- The player node will pick the smallest score from children.
So the bot will pick moves that will give the opponent the lowest score possible, thus maximizing the bot's own score. By using this method, the bot will continuously block the opponent's moves, preventing them from winning, while also going for any possible victories. As a result, using the MiniMax algorithm allowed us to create an unbeatable bot in Tic-Tac-Toe.
Alpha-Beta Pruning
Alpha-Beta pruning is a way of reducing the amount of search by not exploring the child nodes that will never be searched. This is done by having two variables with given names alpha and beta to keep track of scores for alpha-cutoff and beta-cutoff. Chris Thronton explains in his lecture:
- Applying an alpha-cutoff means we stop search of a particular branch because we see that we already have a better opportunity elsewhere.
- Applying a beta-cutoff means we stop search of a particular branch because we see that the opponent already has a better opportunity elsewhere.
So where does Alpha-Beta pruning come in? As an example, let's say it's the bot's turn and that it has two children.
Bot's turn,
root
node: Wants to pick the highest score from children. It will populate the first child node, call itA
, and get it's score. Next, it will have to find the score for the second child, call itB
.Opponent's turn, node
B
: Wants to pick the lowest score from children. It has 3 childB1, B2, B3
. It wants to iterate through each one of those to get the lowest score BUT it also knows that the biggest score that theroot
node has found is from nodeA
. It will also pass that biggest score as argument to each of the children.Bot's turn, node
B1
: Wants to pick the highest score. Re-iteration of step 1 and returns the biggest score to nodeB
.Opponent's turn, back to node
B
: Now we have the score ofB1
. If the score ofB1
is smaller than the score fromroot
node, then it doesn't have to search child nodesB2
andB3
. The nodeB
will pick a score fromB1
or smaller and theroot
node will pick a score bigger than score fromB1
. Because of that, theroot
node will end up never picking from the nodeB
.
So cutoffs happen when alpha >= beta.
Alpha is the biggest score that the root
node might pick and beta is the smallest score that the B
might pick.
Then B1, B2, B3
can sets the highest alpha.
B
compares if B1
alpha is bigger than or equal to B
beta and decides to do a cutoff if it does.
This Alpha-Beta pruning practice is a very neat tool that can help visualize how the process goes.
Acknowledgements
My mentor Rey for guiding us through the assignment. This assignment was done together with Alaric at our internship at PlanChat.