Deep Q Network based Project Submission for Udacity Deep Reinforcement Learning Nanodegree By Sayon Palit
For this project, we will train an agent to navigate (and collect yellow bananas!) in a large, square world.
A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana. Thus, the goal of your agent is to collect as many yellow bananas as possible while avoiding blue bananas.
The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. Given this information, the agent has to learn how to best select actions. Four discrete actions are available, corresponding to:
0 - move forward.
1 - move backward.
2 - turn left.
3 - turn right.
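As an illustration of how an agent might choose among these four actions, here is a minimal epsilon-greedy sketch. It is not taken from this project's notebooks: the names `ACTIONS`, `STATE_SIZE`, and `epsilon_greedy` are hypothetical, and the Q-values are assumed to come from some network that maps the 37-dimensional state to one estimate per action.

```python
import random

# Action indices as described in this README; the names are labels chosen for this sketch.
ACTIONS = {0: "forward", 1: "backward", 2: "left", 3: "right"}
STATE_SIZE = 37  # dimensions of the observation vector


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of four Q-value estimates (hypothetical helper)."""
    if len(q_values) != len(ACTIONS):
        raise ValueError("expected one Q-value per action")
    if rng.random() < epsilon:
        # Explore: choose uniformly among the four actions.
        return rng.randrange(len(ACTIONS))
    # Exploit: choose the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon=0.0` the function always returns the greedy action; a training loop would typically start with a larger epsilon and decay it over episodes.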
The task is episodic, and in order to solve the environment, your agent must get an average score of +13 over 100 consecutive episodes.
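The solve criterion can be checked with a rolling window over episode scores. The sketch below is one possible way to do it and makes two assumptions: "an average score of +13" is read as "at least 13", and the helper name `first_solved_episode` is hypothetical rather than part of this project.

```python
from collections import deque


def first_solved_episode(scores, target=13.0, window=100):
    """Return the 1-based episode at which the rolling average first reaches the target.

    Returns None if the criterion is never met. Assumes "solved" means the mean
    of the last `window` episode scores is >= `target`.
    """
    recent = deque(maxlen=window)  # keeps only the most recent `window` scores
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        # Only evaluate once a full window of consecutive episodes is available.
        if len(recent) == window and sum(recent) / window >= target:
            return episode
    return None
```

Passing the list of per-episode scores collected during training gives the episode number at which the environment counts as solved under these assumptions.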
To run the code, follow these steps:
- Create a new environment:
- Linux or Mac:
conda create --name dqn python=3.6
source activate dqn
- Windows:
conda create --name dqn python=3.6
activate dqn
- Perform a minimal install of OpenAI gym
- If using Windows,
- download swig for Windows and add it to the Windows PATH
- install Microsoft Visual C++ Build Tools
- then run these commands
pip install gym
- Install the dependencies under the folder python/
cd python
pip install .
- Create an IPython kernel for the dqn environment
python -m ipykernel install --user --name dqn --display-name "dqn"
- Download the Unity Environment specific to your operating system
- Start Jupyter Notebook from the root of this project
jupyter notebook
- Once started, change the kernel through the menu Kernel > Change kernel > dqn
- If necessary, update the path to the Unity environment inside the .ipynb files