Exercise 12

In our last exercise we examine policy gradient methods, which allow us to set up control algorithms for environments with continuous state and action spaces. The environment under consideration is LunarLander from OpenAI's gym. This toy example is based on the Atari arcade game Lunar Lander and has an 8-dimensional (continuous) state space and a 2-dimensional (continuous) action space. Plenty of challenges!
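To make the continuous action space concrete, here is a minimal sketch of a Gaussian policy for an 8-dimensional state and a 2-dimensional action. The linear mean, the shared standard deviation `sigma`, and all function names are assumptions chosen for illustration. They are not taken from the exercise notebook, and the sketch uses only the Python standard library rather than gym.

```python
import math
import random

STATE_DIM = 8   # state dimension stated in the exercise text
ACTION_DIM = 2  # action dimension stated in the exercise text


def policy_mean(weights, state):
    """Linear mean, one weight row per action dimension (hypothetical parameterization)."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def sample_action(mean, sigma, rng):
    """Draw each action component independently from N(mean_i, sigma^2)."""
    return [rng.gauss(m, sigma) for m in mean]


def log_prob(action, mean, sigma):
    """Log-density of a diagonal Gaussian with a shared standard deviation."""
    return sum(
        -0.5 * ((a - m) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)
        for a, m in zip(action, mean)
    )


rng = random.Random(0)                                     # fixed seed for repeatability
weights = [[0.0] * STATE_DIM for _ in range(ACTION_DIM)]   # untrained policy
state = [0.1] * STATE_DIM                                  # placeholder state, not a real observation
mean = policy_mean(weights, state)
action = sample_action(mean, 0.5, rng)
```

Sampling from a density rather than picking from a finite set is what lets the policy cover a continuous action space, and `log_prob` is the quantity that both tasks below differentiate.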

Tasks:

Monte-Carlo policy gradient (REINFORCE) using a Gaussian policy
Actor-Critic algorithm with TD(0) targets using a Gaussian policy
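The two tasks differ mainly in the signal that weights the policy gradient. The sketch below contrasts the Monte-Carlo return used by REINFORCE with the TD(0) target used by an actor-critic. The helper names and the fixed-`sigma` Gaussian are assumptions for illustration, not the notebook's implementation.

```python
def discounted_returns(rewards, gamma):
    """Monte-Carlo returns G_t = r_t + gamma * G_{t+1}, computed backwards over one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def td0_target(reward, next_value, gamma, done):
    """One-step bootstrapped target r + gamma * V(s'), without bootstrapping at terminal states."""
    return reward if done else reward + gamma * next_value


def td_error(reward, value, next_value, gamma, done):
    """TD error: the critic's regression error, also usable as the actor's weight."""
    return td0_target(reward, next_value, gamma, done) - value


def score_wrt_mean(action, mean, sigma):
    """Gradient of the Gaussian log-density with respect to the mean: (a - mu) / sigma^2."""
    return [(a - m) / sigma ** 2 for a, m in zip(action, mean)]


# REINFORCE weights the score at step t by the full return G_t.
returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)

# Actor-critic weights the score by the one-step TD error instead.
delta = td_error(reward=1.0, value=0.5, next_value=2.0, gamma=0.5, done=False)
```

REINFORCE has to wait for the end of the episode before any return is available, whereas the TD(0) variant can update after every step at the cost of relying on the critic's value estimate.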
