Setting up a Custom Instance on Microsoft Azure for Training (works with the ML-Agents toolkit v0.3)
This page contains instructions for setting up a custom Virtual Machine on Microsoft Azure so you can run ML-Agents training in the cloud.
-
Start by deploying an Azure VM with Ubuntu Linux (tests were done with 16.04 LTS). To use GPU support, use an N-Series VM.
-
SSH into your VM.
-
Start with the following commands to install the Nvidia driver:
wget http://us.download.nvidia.com/tesla/375.66/nvidia-diag-driver-local-repo-ubuntu1604_375.66-1_amd64.deb
sudo dpkg -i nvidia-diag-driver-local-repo-ubuntu1604_375.66-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda-drivers
sudo reboot
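Once the VM is back up and you have reconnected, it can help to confirm that the driver actually loaded before moving on. The following is a minimal sketch, assuming the driver package puts the nvidia-smi utility on your PATH; check_nvidia_driver is a hypothetical helper name used only for illustration.

```shell
# Report whether the NVIDIA driver utility is available after the reboot.
# check_nvidia_driver is a hypothetical helper, not part of any toolkit.
check_nvidia_driver() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # nvidia-smi lists the detected GPUs and the loaded driver version.
        nvidia-smi
    else
        echo "nvidia-smi not found: the driver may not be installed" >&2
        return 1
    fi
}
```

Calling check_nvidia_driver on the VM should print the GPU table if the driver is working; a non-zero exit status suggests the driver install needs another look.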
-
After a minute you should be able to reconnect to your VM and install the CUDA toolkit:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda-8-0
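To check that the toolkit landed where the later steps expect it, you can look for the CUDA compiler. This is a rough sketch that assumes the toolkit is reachable under /usr/local/cuda (the same prefix used for LD_LIBRARY_PATH later on this page); check_cuda_toolkit is a hypothetical helper, and the location may differ on your VM.

```shell
# Check that the CUDA compiler is present under the toolkit root.
# check_cuda_toolkit is a hypothetical helper; /usr/local/cuda is an
# assumed default and can be overridden with the first argument.
check_cuda_toolkit() {
    cuda_root="${1:-/usr/local/cuda}"
    if [ -x "$cuda_root/bin/nvcc" ]; then
        # nvcc --version prints the toolkit release information.
        "$cuda_root/bin/nvcc" --version
    else
        echo "nvcc not found under $cuda_root" >&2
        return 1
    fi
}
```

Run check_cuda_toolkit with no argument to test the default location, or pass another directory if the toolkit was installed elsewhere.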
-
You'll next need to download cuDNN from the Nvidia developer site. This requires a registered account.
-
Navigate to http://developer.nvidia.com, then create and verify an account.
-
Download (to your own computer) cuDNN from this url.
-
Copy the deb package to your VM:
scp libcudnn6_6.0.21-1+cuda8.0_amd64.deb <VMUserName>@<VMIPAddress>:libcudnn6_6.0.21-1+cuda8.0_amd64.deb
-
SSH back to your VM and execute the following:
sudo dpkg -i libcudnn6_6.0.21-1+cuda8.0_amd64.deb
export LD_LIBRARY_PATH=/usr/local/cuda/lib64/:/usr/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH
. ~/.profile
sudo reboot
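Note that an export typed at the prompt only lasts for the current shell session, so it will not survive the reboot. If you want the library path to be set on every login, one option is to append the export line to ~/.profile before rebooting. The sketch below is an assumption about how you might do that, not a step from the original instructions; persist_cuda_paths is a hypothetical helper name.

```shell
# Append the export line to a profile file only if it is not already there.
# persist_cuda_paths is a hypothetical helper; the target file defaults to
# ~/.profile and can be overridden with the first argument.
persist_cuda_paths() {
    profile="${1:-$HOME/.profile}"
    # Single quotes keep $LD_LIBRARY_PATH unexpanded so it is evaluated at login.
    line='export LD_LIBRARY_PATH=/usr/local/cuda/lib64/:/usr/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH'
    # grep -F matches the text literally; -x requires a whole-line match.
    if ! grep -qxF "$line" "$profile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$profile"
    fi
}
```

After calling persist_cuda_paths, sourcing ~/.profile applies the setting to the current session as well. The helper is safe to run more than once because it skips the append when the line is already present.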
-
After a minute, you should be able to SSH back into your VM. After doing so, run the following:
sudo apt install python-pip
sudo apt install python3-pip
-
At this point, you need to install TensorFlow. Which package you install depends on whether you train on a GPU or a CPU. To train on a GPU:
pip3 install tensorflow-gpu==1.4.0 keras==2.0.6
Or, to train on a CPU:
pip3 install tensorflow==1.4.0 keras==2.0.6
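After the install, a quick check can show which TensorFlow version was picked up and which devices it can see. This is a minimal sketch, assuming python3 is the interpreter that pip3 installed into; check_tensorflow is a hypothetical helper name. On a working GPU setup you would expect a GPU entry in the device list, while a CPU-only install lists only the CPU.

```shell
# Print the installed TensorFlow version and the devices it can see.
# check_tensorflow is a hypothetical helper, not part of ML-Agents.
check_tensorflow() {
    python3 - <<'EOF'
import tensorflow as tf
from tensorflow.python.client import device_lib

print("TensorFlow version:", tf.__version__)
# Each entry describes one device TensorFlow can use (CPU or GPU).
for device in device_lib.list_local_devices():
    print(device.device_type, device.name)
EOF
}
```

If calling check_tensorflow fails with an import error that mentions a missing CUDA or cuDNN library, the LD_LIBRARY_PATH setting from the earlier step is a reasonable first thing to check.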
-
You'll then need to install additional dependencies:
pip3 install pillow
pip3 install numpy
pip3 install docopt
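To confirm the packages are importable, a small loop over their module names is usually enough. The sketch below assumes python3 is the interpreter used for training; check_python_deps is a hypothetical helper name. Note that the pillow package is imported under the module name PIL.

```shell
# Try to import each dependency and report the result per module.
# check_python_deps is a hypothetical helper; pillow is imported as PIL.
check_python_deps() {
    for module in PIL numpy docopt; do
        if python3 -c "import $module" >/dev/null 2>&1; then
            echo "$module: ok"
        else
            echo "$module: MISSING"
        fi
    done
}
```

Calling check_python_deps prints one line per module; any line marked MISSING points at a package to reinstall with pip3.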
-
You can now return to the main Azure instruction page.