Implementation of "Hierarchical and Interpretable Skill Acquisition in Multi-task Reinforcement Learning" by Tianmin Shu, Caiming Xiong, and Richard Socher
The paper proposes a new method for solving multi-task environments [1]. The authors introduce a hierarchical approach and compare it to baselines such as a "flat" policy and H-DRLN.
- Hierarchical Design
- Interpretable Policies
- Curriculum Learning
The picture above shows the proposed architecture. It can be summarized in one sentence: at any time step t, the agent decides whether to use one of the already trained policies for a chosen sub-task or to act on its own with low-level actions.
Everything up to and including the LSTM encodes the current state. Based on this encoding, we must decide on several things:
- Which sub-task policy should we use? (Instruction Policy; this is where the interpretability comes from)
- Should we actually use the chosen sub-task policy? (Switch Policy)
- If we do not use the sub-task policy, what should we do? (Augmented Policy)
If we decide to switch to the sub-task policy, the Base Policy module is used. The base policy is the global policy trained at the previous level and has the same architecture described above, so the hierarchy nests recursively, level after level.
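Below is a minimal PyTorch sketch of one level of this hierarchy. Module names, layer sizes, the LSTM state encoder, and the returned dictionary are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn


class HierarchicalPolicy(nn.Module):
    """One level of the hierarchy: instruction, switch, and augmented policies plus a value head."""

    def __init__(self, state_dim, n_instructions, n_actions, base_policy=None, hidden=128):
        super().__init__()
        self.encoder = nn.LSTM(state_dim, hidden, batch_first=True)  # state encoder
        self.instruction_head = nn.Linear(hidden, n_instructions)    # which sub-task to delegate
        self.switch_head = nn.Linear(hidden, 2)                      # delegate to base policy or act directly?
        self.augmented_head = nn.Linear(hidden, n_actions)           # low-level (primitive) actions
        self.value_head = nn.Linear(hidden, 1)                       # state value for A2C
        self.base_policy = base_policy                               # global policy trained at the previous level

    def forward(self, state_seq):
        h, _ = self.encoder(state_seq)
        h = h[:, -1]                                                 # last hidden state as the encoding
        return {
            "instruction_logits": self.instruction_head(h),
            "switch_logits": self.switch_head(h),
            "action_logits": self.augmented_head(h),
            "value": self.value_head(h),
        }

    def act(self, state_seq):
        out = self.forward(state_seq)
        switch = torch.distributions.Categorical(logits=out["switch_logits"]).sample().item()
        if switch == 1 and self.base_policy is not None:
            # Delegate: choose a sub-task instruction and hand control to the
            # previous-level (base) policy until that sub-task terminates.
            instruction = torch.distributions.Categorical(logits=out["instruction_logits"]).sample().item()
            return {"switch": 1, "instruction": instruction}
        # Otherwise take a primitive action directly.
        action = torch.distributions.Categorical(logits=out["action_logits"]).sample().item()
        return {"switch": 0, "action": action}
```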
The policy is optimized with advantage actor-critic (A2C). Why not A3C? The authors leave that as possible future work.
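For reference, here is a condensed sketch of a generic advantage actor-critic update (not the paper's exact objective; `logits`, `actions`, `values`, and `returns` are assumed to come from a collected rollout):

```python
import torch
import torch.nn.functional as F


def a2c_loss(logits, actions, values, returns, value_coef=0.5, entropy_coef=0.01):
    """Standard advantage actor-critic loss for one batch of rollout steps."""
    dist = torch.distributions.Categorical(logits=logits)
    advantages = returns - values.detach()                    # A(s, a) = R - V(s)
    policy_loss = -(dist.log_prob(actions) * advantages).mean()
    value_loss = F.mse_loss(values, returns)                  # critic regression toward the returns
    entropy_bonus = dist.entropy().mean()                     # encourages exploration
    return policy_loss + value_coef * value_loss - entropy_coef * entropy_bonus
```

In the hierarchical model, the switch and instruction heads would receive analogous policy-gradient terms computed from the same advantages.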
To make this architecture work, we have to manually specify the order of the tasks and pre-train the policy at level zero. In particular, the authors use the following curriculum: "Find object" -> "Get object" -> "Put object" -> "Stack object".
"Find object" is the level-zero policy, hence it must be pre-trained before moving on to the next-level task ("Get object").
- Multi-task environment - an environment where the agent's goal is to find a trajectory that solves a problem composed of smaller sub-problems; e.g. to solve the instruction "Get object", the agent must first be able to solve "Find object".
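As an illustration, such a task can be thought of as a (skill, object) instruction together with the chain of sub-tasks it depends on; the encoding below is an assumption for clarity, not taken from the paper:

```python
# Each instruction is a (skill, object) pair; higher-level skills build on lower-level ones.
PREREQUISITE = {
    "get": "find",    # to get an object you first have to find it
    "put": "get",     # to put an object down you first have to be holding it
    "stack": "put",   # to stack objects you first have to put one down
}


def decompose(skill, obj):
    """Expand an instruction into the ordered chain of sub-instructions it depends on."""
    chain = [(skill, obj)]
    while chain[-1][0] in PREREQUISITE:
        chain.append((PREREQUISITE[chain[-1][0]], obj))
    return list(reversed(chain))  # decompose("stack", "x") -> find, get, put, stack
```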
1. Set up the environment
   - Define the training environment
   - Define the testing environment
   - Implement blocks/agent random placement for the training environment
   - Implement blocks/agent random placement for the testing environment
   - Define the curriculum
2. Build RL models
- Implement "flat" model
- Implement hierarchical model
- "Flat" part
- Augmented policy
- Switch policy
- Instruction policy
- Value functions
- Use of base policy
- A2C optimization
     - Stochastic Temporal Grammar (see the sketch after this roadmap)
3. Train the agent
   - Flat policy
     - Task #1 - Find x
     - Task #2 - Get x
     - Task #3 - Put x
     - Task #4 - Stack x
   - Hierarchical policy
     - Task #1 - Find x
     - Task #2 - Get x
     - Task #3 - Put x
     - Task #4 - Stack x
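The "Stochastic Temporal Grammar" item in the roadmap above refers to the prior the paper places over sequences of switch/instruction decisions, learned from successful episodes. A very rough count-based sketch of that idea follows; the data structures and the way the prior would be combined with the policy are assumptions, not the paper's exact formulation.

```python
from collections import defaultdict


class SimpleTemporalGrammar:
    """Count-based bigram prior over (switch, instruction) events.

    A rough stand-in for the paper's stochastic temporal grammar, updated only
    from successful episodes.
    """

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, events):
        """Record transitions from a successful episode's (switch, instruction) sequence."""
        for prev, curr in zip(events, events[1:]):
            self.counts[prev][curr] += 1

    def prior(self, prev, curr):
        """Estimate P(curr | prev) from the counts; stay neutral when there is no data."""
        total = sum(self.counts[prev].values())
        if total == 0:
            return 1.0  # no statistics yet: do not bias the policy
        return self.counts[prev][curr] / total
```

During rollouts, the distributions produced by the switch and instruction policies could then be re-weighted by this prior before sampling.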