# Hierarchical Reinforcement Learning

## Modular Multitask Reinforcement Learning with Policy Sketches [ICML 2017]

- Proposes "policy sketches" as supervision for hierarchical agents: per-task sequences of reusable behavior symbols, with no specification of the behaviors themselves (illustrated below)
- Interesting idea, and probably a more efficient form of supervision than e.g. reward design
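
As an illustration, a sketch is just an ordered list of named sub-policy symbols attached to each task; the task and symbol names below are made up in the style of the paper's crafting domain, not taken from it.

```python
# Hypothetical policy sketches: each task is annotated only with an ordered
# list of sub-policy symbols; what each sub-policy actually does is never
# specified and is learned jointly across tasks.
SKETCHES = {
    "make planks": ["get wood", "use workbench"],
    "make rope":   ["get grass", "use toolshed"],
}

# Sub-policies are indexed by symbol, so "get wood" reuses the same
# parameters wherever it appears in a sketch.
shared_subpolicies = {symbol for sketch in SKETCHES.values() for symbol in sketch}
```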

## FeUdal Networks for Hierarchical Reinforcement Learning [ICML 2017]

- Proposes a manager-worker architecture: the manager sets goals for the worker, and both are trained with policy gradients (the worker's intrinsic reward is sketched below)
- Assumes that the manager's decisions induce a particular distribution over future states; this seems unlikely
- Scores ~2000 on Montezuma's Revenge
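
A minimal sketch of the worker's intrinsic reward: the worker is rewarded for moving through latent space in the directions the manager proposed, measured by cosine similarity over a horizon c. The array-based interface below is an assumption.

```python
import numpy as np

def worker_intrinsic_reward(states, goals, t, c=10):
    """Average cosine similarity between the change in the latent state and
    the manager's past goal directions over horizon c. `states` and `goals`
    are sequences of latent vectors (a hypothetical interface)."""
    def cos(u, v):
        return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)
    return np.mean([cos(states[t] - states[t - i], goals[t - i])
                    for i in range(1, c + 1)])
```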

## The Predictron: End-To-End Learning and Planning [arXiv 2016]

- Proposes an NN architecture that learns an internal MRP and outputs value estimates (core computation sketched below)
- Shows that the network queries different planning depths on different tasks
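
A sketch of the core computation: the network unrolls its learned internal MRP and forms k-step "preturns" that bootstrap from an internal value head. Here `core` and `value` are stand-ins for the learned modules (assumed interface).

```python
def predictron_preturns(core, value, s0, K):
    """Unroll the internal MRP K steps from abstract state s0 and return the
    k-step preturns g_k = r_1 + g_1*r_2 + ... + (g_1*...*g_k) * v(s_k).
    `core(s) -> (s', r, gamma)` and `value(s) -> v` are hypothetical learned
    modules; the paper mixes the g_k with learned lambda-weights to form the
    final value estimate."""
    s, acc, disc = s0, 0.0, 1.0
    preturns = []
    for _ in range(K):
        s, r, gamma = core(s)   # internal transition, reward, and discount
        acc += disc * r
        disc *= gamma
        preturns.append(acc + disc * value(s))
    return preturns
```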

## Stochastic Neural Networks for Hierarchical Reinforcement Learning [ICLR 2017]

- Uses a stochastic neural network to learn skills before the task is presented (pre-training)
- Trains skills by maximizing channel capacity, i.e. the mutual information between the latent skill code and the states it visits (rough sketch below)
- Solves continuous control tasks that were previously unsolved
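
A rough sketch of the information-theoretic bonus: reward the latent skill code z for producing distinguishable state visitations, here via empirical counts over discretized state cells. The count-table interface is an assumption.

```python
import math
from collections import defaultdict

def skill_mi_bonus(counts, cell, z, alpha=0.01, eps=1e-8):
    """Bonus ~ log of the empirical probability that skill code z was active
    given a visit to this state cell; `counts[cell][z]` is a hypothetical
    visitation table updated as trajectories are collected."""
    total = sum(counts[cell].values())
    p = counts[cell][z] / total if total else eps
    return alpha * math.log(max(p, eps))

# Example visitation table: counts[cell][z] is incremented whenever skill z
# visits the discretized state cell.
counts = defaultdict(lambda: defaultdict(int))
```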

## Surprise-based Intrinsic Motivation for Deep Reinforcement Learning [ICLR 2017]

- Introduces an additional reward term proportional to how unexpected the observed state transition is under the agent's learned dynamics model (minimal sketch below)
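
A minimal sketch with a toy Gaussian model standing in for the agent's learned dynamics model; the surprise term is the model's negative log-likelihood of the observed transition, scaled by a coefficient eta. Both the model and eta are assumptions for illustration.

```python
import numpy as np

class GaussianDynamicsModel:
    """Toy stand-in for a learned transition model: s' ~ N(s + W a, sigma^2 I)."""
    def __init__(self, state_dim, action_dim, sigma=0.5):
        self.W = np.zeros((state_dim, action_dim))
        self.sigma = sigma

    def nll(self, s, a, s_next):
        # Negative log-likelihood of the observed transition under the model.
        diff = s_next - (s + self.W @ a)
        return (0.5 * np.sum(diff ** 2) / self.sigma ** 2
                + 0.5 * len(s) * np.log(2 * np.pi * self.sigma ** 2))

def shaped_reward(env_reward, model, s, a, s_next, eta=0.05):
    # Total reward = extrinsic reward + eta * surprise (model NLL).
    return env_reward + eta * model.nll(s, a, s_next)
```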

## Strategic Attentive Writer for Learning Macro-Actions [NIPS 2016]

- Develops an algorithm that learns to plan sequences of actions together with a commitment plan deciding when to re-plan (control flow sketched below)
- Only works for finite action spaces and a predetermined planning horizon
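
The control flow can be sketched as follows: the network keeps an action plan and a commitment plan, executes from the plan while committed, and re-plans when the commitment bit says so. The `replan` callable is a hypothetical stand-in for the network's attentive-writing update.

```python
def straw_step(plan, commit, t, replan):
    """Follow the stored action plan while committed; otherwise re-plan.
    `plan` and `commit` are the action-plan and commitment-plan arrays the
    network maintains; `replan(t)` is a hypothetical stand-in for the
    module that produces fresh plans."""
    if not commit[t]:             # commitment bit says: stop and re-plan
        plan, commit = replan(t)
    return plan[t], plan, commit  # execute the planned action for step t
```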

## Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation [NIPS 2016]

- Introduces h-DQN, a deep-network take on the options framework (two-level loop sketched below)
- Uses hardwired goals (similar to the 'salient events' of Singh et al. 2004)
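
A tabular sketch of the two-level loop (the paper uses deep Q-networks; `env`, `GOALS`, and `reached` are hypothetical stand-ins, with goals playing the role of the hardwired salient events):

```python
import random
from collections import defaultdict

def h_dqn_episode(env, ACTIONS, GOALS, reached, q_meta, q_ctrl,
                  eps=0.1, gamma=0.99, alpha=0.1):
    """One episode of a tabular h-DQN-style agent. q_meta maps (state, goal)
    to value, q_ctrl maps (state, goal, action) to value; both can be
    defaultdict(float). `reached(s, g)` tests a hardwired salient event."""
    s, done = env.reset(), False
    while not done:
        # Meta-controller picks which salient event to pursue next.
        g = (random.choice(GOALS) if random.random() < eps
             else max(GOALS, key=lambda g_: q_meta[(s, g_)]))
        s0, extrinsic = s, 0.0
        while not done and not reached(s, g):
            # Controller is conditioned on the current goal.
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda a_: q_ctrl[(s, g, a_)]))
            s2, r, done = env.step(a)
            extrinsic += r
            intrinsic = 1.0 if reached(s2, g) else 0.0  # binary goal reward
            target = intrinsic + gamma * max(q_ctrl[(s2, g, a_)] for a_ in ACTIONS)
            q_ctrl[(s, g, a)] += alpha * (target - q_ctrl[(s, g, a)])
            s = s2
        # Meta-controller learns from the extrinsic reward the goal earned.
        target = extrinsic + gamma * max(q_meta[(s, g_)] for g_ in GOALS)
        q_meta[(s0, g)] += alpha * (target - q_meta[(s0, g)])
```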

## The Option-Critic Architecture [NIPS Workshop 2015]

- Derives policy gradient theorems for options (stated below)
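
For reference, the two gradients take roughly this form, with ω indexing options, θ parameterizing the intra-option policies, and ν the terminations; this is a paraphrase from memory, not a quote from the paper.

```latex
% Intra-option policy gradient:
\frac{\partial Q_\Omega(s,\omega)}{\partial \theta}
  = \mathbb{E}\!\left[\frac{\partial \log \pi_{\omega,\theta}(a \mid s)}{\partial \theta}\, Q_U(s,\omega,a)\right]

% Termination gradient:
\frac{\partial U(\omega, s')}{\partial \nu}
  = -\,\mathbb{E}\!\left[\frac{\partial \beta_{\omega,\nu}(s')}{\partial \nu}\, A_\Omega(s',\omega)\right]
```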

## Intrinsically Motivated Reinforcement Learning [NIPS 2004]

- Intrinsic motivation suggests that there is value in defining options independently of the task
- The agent creates options to achieve 'salient events': events predetermined to be inherently interesting to the agent

## Recent Advances in Hierarchical Reinforcement Learning [DEDS 2003]

- Summarizes three approaches to hierarchical RL: options, HAMs, and MAXQ

## Learning Options in Reinforcement Learning [SARA 2002]

- Learns options by randomly specifying tasks and making the most frequently visited states the termination conditions of options (frequency heuristic sketched below)
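
The frequency heuristic is simple enough to sketch directly; the trajectory format below is an assumption.

```python
from collections import Counter

def discover_termination_states(trajectories, k=3):
    """Pick the k most frequently visited states across trajectories collected
    on randomly specified tasks; these become candidate option termination
    states. `trajectories` is an iterable of state sequences (assumed)."""
    visits = Counter(s for traj in trajectories for s in traj)
    return [s for s, _ in visits.most_common(k)]
```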

## Between MDPs and semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning [Artificial Intelligence Journal 1999]

- Defines an option as a policy together with an initiation condition and a termination condition
- Proves that acting in an MDP with options instead of only primitive actions yields an SMDP
- Proves that allowing options to be interrupted when a better one is available yields returns at least as high
- Derives Q-learning over options and proves its convergence (update rule below)
- Shows an environment where predefined options considerably shorten learning
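
The SMDP Q-learning update at the heart of the convergence result: when option o, started in s, terminates after k steps in s' with cumulative discounted reward r,

```latex
Q(s, o) \leftarrow Q(s, o)
  + \alpha \Big[ r + \gamma^{k} \max_{o' \in \mathcal{O}_{s'}} Q(s', o') - Q(s, o) \Big]
```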

## Learning Macro-Actions in Reinforcement Learning [NIPS 1998]

- Lets the previous action influence the choice of the next action
- Defines a modified Q-value as a linear combination of Q(s_t, a_t) and Q(a_{t-1}, a_t) (one reading below)
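
One plausible reading of that combination; the mixing weight λ is not confirmed by the note above, so treat its exact form as an assumption:

```latex
\tilde{Q}(s_t, a_{t-1}, a_t) = (1 - \lambda)\, Q(s_t, a_t) + \lambda\, Q(a_{t-1}, a_t)
```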