TicTacToe-Reinforcement-Learning

This project implements Q-learning, a reinforcement learning algorithm, for the game of Tic Tac Toe. The aim is to train an AI agent to play the game optimally.

Here's a breakdown of how the learntris_both_X_and_O_by playing_vs_each_other.py script works (the other two training scripts, learntris_as_X.py and learntris_as_O.py, are just simplified versions):

Initialization: The Tic Tac Toe board is a list of 9 items (3x3 grid) where '_' represents an empty space. The script also initializes a Q-table as an empty dictionary, which will be filled as the AI learns. The wins variable contains all possible winning combinations in the game.
Q-Learning Parameters: The script sets several hyperparameters including alpha (learning rate), gamma (discount factor), epsilon (exploration rate), and epsilon_decay (rate at which epsilon decreases).
Main Loop: The AI plays a large number of games (as defined by episodes) against itself. On each move, it decides whether to make a random move (exploration) or to use the Q-table to decide the best move (exploitation). This decision is based on the value of epsilon.
Updating Q-Table: After each move, the script updates the Q-value of the state-action pair using the Q-learning update rule. When a game ends, it gives a reward of +100 for a win and updates the Q-values accordingly.
Tracking Performance: The script keeps track of the number of games won, lost, and drawn, and prints these stats after every eval_interval episodes.
Model Saving: At the end of training, the script saves the final Q-table as a JSON file named 'model_x_and_o.json'.
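
As an illustration of the Initialization and Main Loop steps, here is a minimal sketch of the board, the winning combinations, and an epsilon-greedy move choice. It is not the script's actual code: the names WINS, q_key, winner, available_moves, and choose_move, and the "state:move" string layout of the Q-table keys, are assumptions made for this example.

```python
import random

# All eight winning lines on a 3x3 board stored as a flat list of 9 cells.
WINS = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def new_board():
    """Return an empty board: 9 cells, '_' meaning an empty space."""
    return ['_'] * 9


def q_key(state, move):
    """Hypothetical key layout: board string and move index joined by ':'."""
    return f"{state}:{move}"


def winner(board):
    """Return 'X' or 'O' if that player owns a full winning line, else None."""
    for a, b, c in WINS:
        if board[a] != '_' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def available_moves(board):
    """Indices of the cells that are still empty."""
    return [i for i, cell in enumerate(board) if cell == '_']


def choose_move(q_table, board, epsilon, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    moves = available_moves(board)
    if rng.random() < epsilon:
        return rng.choice(moves)          # exploration: random legal move
    state = ''.join(board)
    # Exploitation: pick the legal move with the highest known Q-value.
    # State-action pairs not seen yet default to 0.0.
    return max(moves, key=lambda m: q_table.get(q_key(state, m), 0.0))
```

With epsilon set to 0 the function always exploits the Q-table, and with epsilon set to 1 it always picks a random legal move.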
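
The Q-Learning Parameters and Updating Q-Table steps can be sketched with the standard one-step Q-learning rule, Q(s, a) = Q(s, a) + alpha * (reward + gamma * max Q(s', a') - Q(s, a)). The hyperparameter values, the function names, and the "state:move" key layout below are assumptions for illustration; the script's real values and the way it handles alternating turns in self-play are not shown in this README.

```python
ALPHA = 0.5            # learning rate (assumed value)
GAMMA = 0.9            # discount factor (assumed value)
EPSILON_DECAY = 0.999  # multiplicative decay per episode (assumed value)
MIN_EPSILON = 0.01     # lower bound for exploration (assumed, not from the script)


def q_key(state, move):
    """Hypothetical key layout: board string and move index joined by ':'."""
    return f"{state}:{move}"


def update_q(q_table, state, move, reward, next_state, next_moves,
             alpha=ALPHA, gamma=GAMMA):
    """Apply one Q-learning update in place and return the new Q-value.

    next_moves is the list of legal moves in next_state; it is empty when
    the game is over, in which case the future value is 0.
    """
    best_next = max(
        (q_table.get(q_key(next_state, m), 0.0) for m in next_moves),
        default=0.0,
    )
    old = q_table.get(q_key(state, move), 0.0)
    new = old + alpha * (reward + gamma * best_next - old)
    q_table[q_key(state, move)] = new
    return new


def decay_epsilon(epsilon, decay=EPSILON_DECAY, floor=MIN_EPSILON):
    """Shrink epsilon after an episode, never going below the floor."""
    return max(floor, epsilon * decay)
```

For example, a winning move with the +100 reward and a previously unseen state-action pair moves the Q-value from 0 to alpha * 100.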
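
The Tracking Performance and Model Saving steps might look roughly like the sketch below. The function names and the 'win' / 'loss' / 'draw' counter keys are hypothetical, and the sketch assumes the Q-table uses string keys, since JSON objects cannot use tuples as keys.

```python
import json
from collections import Counter


def summarize(stats, episode):
    """Format win/loss/draw rates for printing every eval_interval episodes."""
    total = sum(stats.values()) or 1  # avoid dividing by zero before any game
    return (f"episode {episode}: "
            f"wins {stats['win'] / total:.1%}, "
            f"losses {stats['loss'] / total:.1%}, "
            f"draws {stats['draw'] / total:.1%}")


def save_q_table(q_table, path):
    """Write the Q-table to a JSON file (keys must be strings)."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(q_table, f)


def load_q_table(path):
    """Read a Q-table back from a JSON file."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)


if __name__ == '__main__':
    stats = Counter(win=6, loss=1, draw=3)
    print(summarize(stats, 1000))
    save_q_table({'_________:4': 1.0}, 'example_q_table.json')
```

A Counter returns 0 for outcomes that have not occurred yet, so the summary works even before every kind of result has been seen.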

Repository files:

README.md
learntris_as_O.py
learntris_as_X.py
learntris_both_X_and_O_by playing_vs_each_other.py
modelO.json
model_x_and_o.json
model_x_best.json
model_x_best1.json
play_tris_as_O.py
play_tris_as_X.py
