diff --git a/README.md b/README.md
index 215601c..2f6389a 100644
--- a/README.md
+++ b/README.md
@@ -31,13 +31,19 @@
Installation
- Usage
+
+ Usage
+
+
Team
License
-### About
+## About
-----
@@ -47,13 +53,13 @@ Our journey begins with Connect Four, serving as a testing ground and proof of c
This endeavor not only highlights Cogito NTNU's commitment to pushing the boundaries of AI and machine learning but also underscores our passion for blending technology with traditional games, revealing new dimensions of play and strategy. Join us as we explore the frontiers of artificial intelligence, one game at a time.
-### Getting Started
+## Getting Started
This section provides a comprehensive guide to setting up and running the project. By following the steps outlined below, you'll be prepared to embark on a journey of deep reinforcement learning with board games, starting with Connect Four and progressing to Chess.
-----
-#### Prerequisites
+### Prerequisites
Before you begin, ensure that your system meets the following requirements:
@@ -61,7 +67,7 @@ Before you begin, ensure that your system meets the following requirements:
- **Python Libraries**: Essential libraries such as NumPy and PyTorch are necessary. These are listed in the `requirements.txt` file for easy installation.
- **Hardware**: For optimal performance, an NVIDIA GPU with CUDA installed is recommended. Running deep reinforcement learning models, especially for complex games like chess, is computationally intensive and may not be feasible on CPU alone.
-#### Installation
+### Installation
1. **Clone the Project**: Begin by cloning the repository to your local machine or development environment.
@@ -75,15 +81,71 @@ Before you begin, ensure that your system meets the following requirements:
pip install -r requirements.txt
```
-#### Usage
+## Usage
+
+### Running main with Command Line Flags
+
+`main.py` handles the different gameplay and training scenarios of this AlphaZero implementation via command-line flags. The available flags are described below:
+
+`--test_overfit`: Test overfitting on the Connect Four game model.
+
+`--train_ttt`: Train the AlphaZero model on Tic Tac Toe.
+
+`--train_c4`: Run an extended training session on the Connect Four game.
+
+`--self_play_ttt`: Run self-play simulations on the Tic Tac Toe model.
+
+`--self_play_c4`: Run self-play simulations on the Connect Four model.
+
+`--play_ttt`: Play against the AlphaZero model in Tic Tac Toe.
+
+`--play_c4`: Play against the AlphaZero model in Connect Four.
+
+`--first` or `-f`: The human player moves first.
+
+`--mcts` or `-m`: AlphaZero plays against plain MCTS instead of a human player.
+
+### Examples of Usage
+
+Here are examples of how to run `main.py` with the available flags from a Bash shell:
+
+```bash
+# Test overfitting on Connect Four
+python main.py --test_overfit
+
+# Train AlphaZero on Tic Tac Toe
+python main.py --train_ttt
+
+# Train AlphaZero on Connect Four
+python main.py --train_c4
+
+# Run self-play on Tic Tac Toe
+python main.py --self_play_ttt
+
+# Run self-play on Connect Four
+python main.py --self_play_c4
+
+# Play as player2 against AlphaZero on Tic Tac Toe
+python main.py --play_ttt
+
+# Play as player2 against AlphaZero on Connect Four
+python main.py --play_c4
+
+# Play as player1 against AlphaZero on Connect Four
+python main.py --play_c4 -f
+
+# Let AlphaZero play as player1 against MCTS
+python main.py --play_c4 -m
-As the project is currently under development, specific usage instructions are pending. Once the project reaches a runnable state, detailed steps on how to initiate training sessions, as well as how to utilize the AI for playing Connect Four and Chess, will be provided here. Stay tuned for updates on how to leverage this AI to challenge the strategic depths of these classic board games.
+# Let AlphaZero play as player2 against MCTS
+python main.py --play_c4 -f -m
+```
-----
By adhering to the above guidelines, you'll be well-prepared to contribute to or experiment with this cutting-edge exploration into deep reinforcement learning for board games. Whether you're a developer, a researcher, or an enthusiast, your journey into AI and strategic gameplay starts here.
-### Team
+## Team
-----
diff --git a/main.py b/main.py
index de8f214..833f6b3 100644
--- a/main.py
+++ b/main.py
@@ -9,12 +9,6 @@
from src.play.play_vs_alphazero import main as play_vs_alphazero
from src.utils.game_context import GameContext
-
-### Idea, make each game generation a longer task.
-# Instead of running one function per game generation, run a function that generates multiple games.
-# This will make the overhead of creating a new multiprocessing process less significant.
-
-
def test_overfit(context: GameContext):
mp.set_start_method('spawn')
@@ -68,21 +62,21 @@ def play(context: GameContext, first: bool, mcts: bool = False):
mcts=mcts
)
-overfit_path = "./models/connect_four/overfit_nn"
+overfit_path = "./models/overfit/connect4_nn"
overfit_context = GameContext(
game_name="connect_four",
nn=NeuralNetworkConnectFour().load(overfit_path),
- save_path="./models/overfit_waste"
+ save_path="./models/overfit/connect4_overfit_waste"
)
-tic_tac_toe_path = "./models/test_nn"
+tic_tac_toe_path = "./models/tic_tac_toe/good_nn"
tic_tac_toe_context = GameContext(
game_name="tic_tac_toe",
nn=NeuralNetwork().load(tic_tac_toe_path),
save_path=tic_tac_toe_path
)
-connect4_path = "./models/connect_four/initial_test"
+connect4_path = "./models/connect_four/good_nn"
connect4_context = GameContext(
game_name="connect_four",
nn=NeuralNetworkConnectFour().load(connect4_path),
@@ -101,8 +95,8 @@ def str2bool(v: str) -> bool:
parser: ArgumentParser = ArgumentParser(description='Control the execution of the AlphaZero game playing system.')
parser.add_argument('--test_overfit', action='store_true', help='Test overfitting on Connect Four.')
-parser.add_argument('--train_tic_tac_toe', action='store_true', help='Train AlphaZero on Tic Tac Toe.')
-parser.add_argument('--train_connect_four', action='store_true', help='Train AlphaZero on Connect Four for a long time.')
+parser.add_argument('--train_ttt', action='store_true', help='Train AlphaZero on Tic Tac Toe.')
+parser.add_argument('--train_c4', action='store_true', help='Train AlphaZero on Connect Four for a long time.')
parser.add_argument('--self_play_ttt', action='store_true', help='Run self-play on Tic Tac Toe.')
parser.add_argument('--self_play_c4', action='store_true', help='Run self-play on Connect Four.')
@@ -122,10 +116,10 @@ def str2bool(v: str) -> bool:
if args.test_overfit:
test_overfit(overfit_context)
- if args.train_tic_tac_toe:
+ if args.train_ttt:
train_tic_tac_toe(tic_tac_toe_context)
- if args.train_connect_four:
+ if args.train_c4:
train_connect_four(connect4_context)
if args.self_play_ttt:
@@ -135,7 +129,7 @@ def str2bool(v: str) -> bool:
self_play(connect4_context)
if args.play_ttt:
- play(tic_tac_toe_context, first=args.first)
+ play(tic_tac_toe_context, first=args.first, mcts=args.mcts)
if args.play_c4:
play(connect4_context, first=args.first, mcts=args.mcts)