In this project, a double-jointed arm can move to target locations. A reward of +0.1 is provided for each step that the agent's hand is in the goal location. Thus, the goal of your agent is to maintain its position at the target location for as many time steps as possible.
The environment is based on Unity ML-agents. The project environment provided by Udacity is similar to the Reacher environment on the Unity ML-Agents GitHub page.
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source Unity plugin that enables games and
simulations to serve as environments for training intelligent agents. Agents can be trained using reinforcement
learning, imitation learning, neuroevolution, or other machine learning methods through a simple-to-use Python API.
The observation space consists of 33 variables corresponding to position, rotation, velocity, and angular velocities of the arm. Each action is a vector with four numbers, corresponding to torque applicable to two joints. Every entry in the action vector should be a number between -1 and 1.
Option 1: Solve the First Version
- The task is episodic, and in order to solve the environment, your agent must get an average score of +30 over 100 consecutive episodes.
Option 2: Solve the Second Version
- The barrier for solving the second version of the environment is slightly different, to take into account the presence of many agents. In particular, your agents must get an average score of +30 (over 100 consecutive episodes, and over all agents). Specifically,
After each episode, we add up the rewards that each agent received (without discounting), to get a score for each agent. This yields 20 (potentially different) scores. We then take the average of these 20 scores. This yields an average score for each episode (where the average is over all 20 agents).
In my implementation, I have chosen to solve the second version of the environment using the DDPG algorithm.
You first need to configure a Python 3.6 / PyTorch 0.4.0 environment with the needed requirements as described in the Udacity repository
Of course you have to clone this project and have it accessible in your Python environment
Then you have to install the Unity environment as described in the Getting Started section (The Unity ML-agant environment is already configured by Udacity)
Download the environment from one of the links below. You need only select the environment that matches your operating system:
Version 2: 20 Agents
-
Linux: click here
-
Mac OSX: click here
-
Windows (32-bit): click here
-
Windows (64-bit): click here
(For Windows users) Check out this link if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.
(For AWS) If you'd like to train the agent on AWS (and have not enabled a virtual screen, then please use this link to obtain the environment.
Finally, unzip the environment archive in the 'project's environment' directory and eventually adjust the path to the UnityEnvironment in the code.
Execute the provided notebook after building your own local environment and make necessary adjustments for the path to the UnityEnvironment in the code.