A pair of reinforcement learning agents that can play tennis 🎾
Languages: Python 3.6 and PyTorch
Environment: Unity ML-Agents Toolkit
Before Training: Reward: 0.0, 0.0, 0.0, 0.1, 0.0
Each action is a continuous vector of size 2, corresponding to horizontal and vertical movement.
Each agent observes an 8-dimensional environment state describing the position and velocity of the ball and racket. Three frames are stacked at each time step, giving an observation vector of size 24.
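The stacking described above (3 frames of 8 values, 24 in total) can be illustrated with a small sketch. The environment already delivers the stacked vector, so this is not code from the project; the class name FrameStack, the zero padding at the start of an episode, and the oldest-first ordering are all assumptions made for illustration.

```python
from collections import deque

FRAME_SIZE = 8   # values per frame, as stated above
NUM_FRAMES = 3   # frames stacked per time step, as stated above


class FrameStack:
    """Hypothetical helper: keeps the latest frames and flattens them."""

    def __init__(self, frame_size=FRAME_SIZE, num_frames=NUM_FRAMES):
        self.frame_size = frame_size
        # Assumption: start with zero-filled frames until real ones arrive.
        self.frames = deque(
            ([0.0] * frame_size for _ in range(num_frames)),
            maxlen=num_frames,
        )

    def push(self, frame):
        """Add one frame and return the stacked vector (oldest frame first)."""
        frame = [float(x) for x in frame]
        if len(frame) != self.frame_size:
            raise ValueError("expected a frame of size %d" % self.frame_size)
        self.frames.append(frame)  # the oldest frame is dropped automatically
        return [x for f in self.frames for x in f]


stack = FrameStack()
observation = stack.push([1.0] * FRAME_SIZE)
print(len(observation))  # FRAME_SIZE * NUM_FRAMES values
```

Using a deque with maxlen keeps the window at exactly three frames without any manual bookkeeping.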
Each agent is given a reward of +0.1 whenever it hits the ball over the net, and these rewards accumulate throughout the episode. When an agent lets the ball hit the ground or hits it out of bounds, it receives a reward of -0.01. The environment is solved when the average, over 100 consecutive episodes, of the maximum reward accumulated by either agent reaches +0.5.
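The solve criterion can be sketched as follows: take the larger of the two agents' accumulated rewards for each episode, then average that value over the last 100 episodes. This is a minimal sketch rather than the code in train.ipynb; the class name SolveTracker and its methods are hypothetical.

```python
from collections import deque

TARGET_SCORE = 0.5   # threshold stated above
WINDOW = 100         # number of consecutive episodes averaged


class SolveTracker:
    """Hypothetical helper that tracks the rolling solve criterion."""

    def __init__(self, target=TARGET_SCORE, window=WINDOW):
        self.target = target
        self.window = window
        self.scores = deque(maxlen=window)  # only the last `window` episodes

    def record(self, agent_returns):
        """Store one finished episode and report whether the task is solved.

        agent_returns holds each agent's accumulated reward for the episode.
        """
        self.scores.append(max(agent_returns))
        return self.solved()

    def average(self):
        """Mean episode score over the current window (0.0 if empty)."""
        return sum(self.scores) / len(self.scores) if self.scores else 0.0

    def solved(self):
        """True once a full window of episodes averages at least the target."""
        return len(self.scores) == self.window and self.average() >= self.target


tracker = SolveTracker()
for episode_returns in ([0.0, 0.1], [0.3, 0.2], [0.0, -0.01]):
    tracker.record(episode_returns)
print(tracker.solved())  # fewer than 100 episodes recorded so far
```

Requiring a full window before reporting success avoids declaring the environment solved after a few lucky early episodes.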
- Install Anaconda if you don't have it already.
- Open Anaconda Prompt/command line/terminal
- Create a new environment (named tennis-env):
conda create --name tennis-env python=3.6
- Activate environment:
conda activate tennis-env
- Navigate to desired directory to download project file:
cd path/to/desired/directory
- Clone the repository:
git clone https://github.com/albertlai431/tennis-ai
- Go to dependencies directory:
cd tennis-ai/python
- Install dependencies (may take a while):
pip install .
- Install PyTorch 0.4.0 with conda:
conda install pytorch=0.4.0 -c pytorch
- Create kernel with environment:
python -m ipykernel install --user --name tennis-env --display-name "tennis-env"
- Launch Jupyter Notebook and navigate to the cloned repository directory
- Open train.ipynb and run the code if you would like to train the agent 💪
- Open test.ipynb and run the code if you would like to observe a fully trained agent! 😃
- Important: Before running any code in either of the ipynb files, click Kernel on the top bar, then Change kernel > tennis-env
- Remember to deactivate the environment in the Anaconda Prompt/command line/terminal after you are done:
conda deactivate
- The folder Tennis_Windows_x86_64 may not always work; if you are getting a UnityTimeOutException, please go to this link and replace Tennis_Windows_x86_64 with the correct folder for your system. You may also need to modify the env declaration.