Trained Warfleet (Group-ID 6)

This project is part of our participation in the intelligent systems course at the HSD in Düsseldorf, Germany.
Our goal is to implement and train an agent capable of competently playing the board game Warfleet, written in Python and trained with reinforcement learning.
To achieve this, we also had to develop a suitable environment for the agent to be trained in.
For this purpose we chose the OpenAI Gym toolkit, which provides an easy-to-use suite of reinforcement learning tasks.

Current State:

The environment is ready for training agents: the rules of the game and the flow of a match are implemented. A basic agent that takes random actions is set up to play against a simple AI opponent that also acts randomly. The agent receives a small reward for every hit and a larger reward for winning a match.
Using the OpenAI Baselines framework, we trained models with the PPO2 and A2C algorithms.
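The reward scheme described above can be sketched as a small function. This is only an illustration: the README does not state the actual reward values, so `HIT_REWARD`, `WIN_REWARD`, and the function name `step_reward` are assumptions chosen for the example.

```python
# Illustrative values only: the project states that a hit earns a small
# reward and a win earns a larger one, but not the exact numbers.
HIT_REWARD = 1
WIN_REWARD = 10


def step_reward(hit: bool, won: bool) -> int:
    """Return the reward for one agent shot under the assumed scheme."""
    reward = 0
    if hit:
        reward += HIT_REWARD  # small reward for hitting a ship part
    if won:
        reward += WIN_REWARD  # larger reward when the match is won
    return reward
```

With these assumed values, a shot that sinks the last ship part yields both the hit reward and the win reward in the same step.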

The playing field, or board, of our game is a 10x10 two-dimensional integer array. Each cell holds one of three values: 1 for water, 2 for part of a ship, and 0 for a position that has been shot at.
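A minimal sketch of this board encoding follows. It uses nested Python lists to stay dependency-free, whereas the project may well use a different array type; the helper names `new_board`, `place_ship`, and `fire` are hypothetical and not taken from the repository.

```python
BOARD_SIZE = 10
SHOT, WATER, SHIP = 0, 1, 2  # cell values as documented in this README


def new_board():
    """Create a 10x10 board filled with water."""
    return [[WATER] * BOARD_SIZE for _ in range(BOARD_SIZE)]


def place_ship(board, row, col, length, horizontal=True):
    """Place a ship of the given length, rejecting invalid positions."""
    cells = [(row, col + i) if horizontal else (row + i, col)
             for i in range(length)]
    for r, c in cells:
        if not (0 <= r < BOARD_SIZE and 0 <= c < BOARD_SIZE):
            raise ValueError("ship leaves the board")
        if board[r][c] != WATER:
            raise ValueError("cell is already occupied")
    for r, c in cells:
        board[r][c] = SHIP


def fire(board, row, col):
    """Mark a cell as shot and report whether a ship part was hit."""
    hit = board[row][col] == SHIP
    board[row][col] = SHOT
    return hit
```

Note that with this encoding a shot cell no longer records whether it was a hit or a miss; that information is only available at the moment of firing.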

The action space of our environment consists of all possible coordinates on the board.

The observation space describes the number of possible values, 3 in this case, for every board position.
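One way to picture these two spaces is shown below. The README does not say how coordinates are encoded, so this sketch assumes each action is a single integer in row-major order; the names `action_to_coord` and `coord_to_action` are hypothetical. In Gym terms, such a layout would plausibly pair with `gym.spaces.Discrete(100)` for actions and a `gym.spaces.MultiDiscrete` of 100 entries with 3 values each for observations, but that pairing is an assumption as well.

```python
BOARD_SIZE = 10
NUM_ACTIONS = BOARD_SIZE * BOARD_SIZE  # one action per board coordinate
NUM_CELL_VALUES = 3                    # shot, water, ship


def action_to_coord(action):
    """Map a flat action index to a (row, col) pair (assumed row-major)."""
    if not 0 <= action < NUM_ACTIONS:
        raise ValueError("action out of range")
    return divmod(action, BOARD_SIZE)


def coord_to_action(row, col):
    """Inverse mapping from (row, col) to the flat action index."""
    return row * BOARD_SIZE + col
```

For example, under this assumed encoding `action_to_coord(57)` gives row 5, column 7.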

Usage Instructions:

Since this project is based on OpenAI Gym, it requires a Python environment with the toolkit installed. You can either set this up beforehand or add gym to your environment after cloning or downloading this repository. Then run one of these files:

To train agents you can run trainAgents.py.
To test agents you can run testAgents.py.

The models are located in the folder: Warfleet_Gym_AI/trained_agents

The console prompt will ask for the algorithm and the number of timesteps.

Here you can see the console output of the agent's board after all ships have been placed and both players have fired multiple shots. As a reminder, the cell values are:
0 = shot
1 = water
2 = ship

To the left we have a representation of the opponent's board and, below it, the corresponding representation of the agent's board in a certain game state. As you can see, the agent won by destroying all of its opponent's ships while 2 of its own ships remain partially intact.

This image shows the last state which resulted in the agent winning this match. In this case it took 295 episode steps to finish the match and the agent gained a combined reward of 42.

End of the game

At the end of the game you can see that the agent has won by hitting every part of every enemy ship. The enemy chose its shots at the agent's ships at random, so they show no structure or strategy. The agent, in contrast, shot each ship along its length: once a ship is hit, its next part lies just one field away, so the agent fires at a neighboring field.
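The behavior described above, following up a hit with a shot at a neighboring field, can be illustrated with a hand-written rule. This is not the trained agent's policy, which is learned rather than coded; the function `adjacent_targets` is a hypothetical helper, and the nested-list board is an assumption made to keep the sketch self-contained.

```python
BOARD_SIZE = 10
SHOT = 0  # cell value for a position that has already been shot at


def adjacent_targets(board, row, col):
    """Return the fields directly beside (row, col) that are still unshot."""
    candidates = [(row - 1, col), (row + 1, col),
                  (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates
            if 0 <= r < BOARD_SIZE and 0 <= c < BOARD_SIZE
            and board[r][c] != SHOT]
```

After a hit at a given position, the returned list contains the fields where the rest of the ship could continue.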

Tensorboard diagrams:

Reward

Advantage, Clip Range, Discounted Reward

Learning Rate

Loss

Future Outlook:
Repository: MSchwarz7757/Trained_Warfleet