Advantage Actor Critic Algorithm
Implementation of the Policy-Gradient and Actor Critic Reinforcement Learning Paradigms for the LunarLander Simulation from OpenAI Gym:
(1) REINFORCE Policy Gradient Algorithm
(2) Advantage Actor Critic Algorithm (A2C)
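
The two algorithms differ mainly in the signal that weights the policy's log-probability gradient: REINFORCE uses the discounted return G_t, while A2C subtracts a learned value estimate V(s_t) to form an advantage. The snippet below is a minimal sketch of that difference, not the code used in this repository. The function names, the Monte Carlo form of the advantage, and the default discount factor of 0.99 are assumptions made for illustration only.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for each step of one episode.

    This is the per-step weight used by REINFORCE (illustrative helper,
    not taken from the repository).
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, gamma=0.99):
    """Compute the Monte Carlo advantage A_t = G_t - V(s_t).

    `values` stands in for the critic's predictions V(s_t); in A2C these
    would come from a learned value network (assumed here, not shown).
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")
    returns = discounted_returns(rewards, gamma)
    return [g - v for g, v in zip(returns, values)]
```

In a training loop, REINFORCE would scale each step's log-probability by the corresponding entry of `discounted_returns(...)`, whereas A2C would use `advantages(...)`, which typically lowers the variance of the gradient estimate. Many A2C implementations use a bootstrapped one-step or n-step target instead of the full Monte Carlo return shown here.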
Experimental Results: