PyTorch implementation of the paper Overcoming Exploration in Reinforcement Learning with Demonstrations in surgical robot manipulation tasks.
- OpenAI Baselines for the TensorFlow-based implementation.
- SurRoL for the training and testing simulation platform.
- DrQv2 for the coding structure and utils modules.
We use Python 3.8 and Anaconda3 for development. To create an environment and install dependencies, run the following steps:
# Clone and cd into herdemo
git clone https://github.com/TaoHuang13/hindsight-experience-replay-with-demo.git
cd hindsight-experience-replay-with-demo
# Create and activate environment
conda create -n herdemo python=3.8 -y
conda activate herdemo
# Install dependencies
pip install -e .
Then add one line of code to gym/gym/envs/__init__.py to register the SurRoL tasks:
import surrol.gym
Run the following command to collect expert demonstrations using the scripted policy defined in each task file:
python surrol/data/data_generation.py --env env_name
Here we have already provided demonstrations of several tasks.
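The collection step can be pictured as rolling out a scripted policy and keeping the successful episodes. The sketch below is only an illustration under stated assumptions: `ToyReachEnv`, `scripted_action`, and `collect_demos` are hypothetical names invented here, and the toy 1-D task stands in for a SurRoL environment. It is not the logic of surrol/data/data_generation.py.

```python
import random


class ToyReachEnv:
    """Hypothetical 1-D goal-reaching task (a stand-in, not a SurRoL env)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.pos = 0.0
        self.goal = 0.0

    def _obs(self):
        # Goal-conditioned observation layout, as commonly used with HER.
        return {"observation": self.pos,
                "achieved_goal": self.pos,
                "desired_goal": self.goal}

    def reset(self):
        self.pos = 0.0
        self.goal = self.rng.uniform(-1.0, 1.0)
        return self._obs()

    def step(self, action):
        # Clip the action so the agent moves at most 0.1 per step.
        self.pos += max(-0.1, min(0.1, action))
        success = abs(self.pos - self.goal) < 0.05
        reward = 0.0 if success else -1.0
        return self._obs(), reward, success

    def scripted_action(self, obs):
        # Scripted expert (assumption): move straight toward the goal.
        return obs["desired_goal"] - obs["achieved_goal"]


def collect_demos(env, episodes, max_steps=50):
    """Roll out the scripted policy and keep only successful episodes."""
    demos = []
    for _ in range(episodes):
        obs = env.reset()
        episode = {"obs": [obs], "actions": [], "success": False}
        for _ in range(max_steps):
            action = env.scripted_action(obs)
            obs, _reward, done = env.step(action)
            episode["actions"].append(action)
            episode["obs"].append(obs)
            if done:
                episode["success"] = True
                break
        if episode["success"]:
            demos.append(episode)
    return demos


demos = collect_demos(ToyReachEnv(seed=0), episodes=5)
print(f"collected {len(demos)} successful episodes")
```

Each stored episode holds one more observation than actions, since the initial observation is recorded before the first step.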
At a high level, our code relies on a generic Python script, train.py, for training and evaluating the RL agent. We use Hydra to parameterize this script with experiment-specific configurations. Specifically, every experiment is configured either in the directory configs/ or through command-line overrides.
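To illustrate how command-line overrides sit on top of file-based defaults, here is a minimal standard-library sketch of dotted `key=value` overrides applied to a nested config. It only mimics the idea and is not Hydra's implementation or API; `apply_overrides` and the keys and values in `defaults` are hypothetical and may not match the files in configs/.

```python
import ast
from copy import deepcopy


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted `key=value` overrides applied."""
    merged = deepcopy(config)
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = merged
        for name in parents:
            # Walk (or create) the nested dictionaries named by the dotted key.
            node = node.setdefault(name, {})
        try:
            # Parse numbers, booleans, lists, and so on.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal stays a plain string.
            value = raw
        node[leaf] = value
    return merged


# Hypothetical defaults, standing in for a file under configs/.
defaults = {"task": "NeedleReach-v0", "seed": 1, "agent": {"lr": 1e-3}}

# Hypothetical overrides, as they might be typed after `python train.py`.
config = apply_overrides(defaults, ["seed=3", "agent.lr=0.0003", "task=NeedlePick-v0"])
print(config)
```

The defaults are copied rather than mutated, so the same base configuration can be reused across a batch of runs.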
The rest of the code is organized as follows:
- configs/ config files for launching experiments.
- rl/ core implementation of HER+DEMO, adapted from OpenAI Baselines.
- surrol/ simulation platform for surgical robotic manipulation, based on PyBullet.
- scripts/ bash scripts for running a batch of experiments.
- train.py generic Python script for training and evaluating the RL agent.
To start an experiment, run the following command:
sh scripts/run_herdemo.sh