CS885-RL

This repository is for the Reinforcement Learning course CS885 taught by Prof. Pascal Poupart at the University of Waterloo. It covers planning by dynamic programming (value iteration, policy iteration, and modified policy iteration), Q-learning, three bandit algorithms (epsilon-greedy, Thompson sampling, and UCB), REINFORCE, and model-based reinforcement learning.
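
As a rough illustration of the first topic listed above, here is a minimal sketch of value iteration on a hypothetical two-state MDP. It is not taken from the course material: the function name `value_iteration`, the toy transition table `T`, the reward table `R`, and the default discount of 0.9 are all assumptions made for this example.

```python
def value_iteration(T, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Value iteration for a small tabular MDP (illustrative sketch).

    T[a][s][s2] is the probability of moving from state s to s2 under action a.
    R[a][s] is the expected immediate reward for taking action a in state s.
    Returns the estimated optimal values and a greedy policy.
    """
    n_actions = len(T)
    n_states = len(T[0])

    def q_values(V):
        # Q[s][a] = R[a][s] + gamma * sum over s2 of T[a][s][s2] * V[s2]
        return [
            [
                R[a][s] + gamma * sum(T[a][s][s2] * V[s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]

    V = [0.0] * n_states
    for _ in range(max_iters):
        new_V = [max(q_s) for q_s in q_values(V)]  # Bellman optimality backup
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:  # stop once the largest update is below tol
            break

    # Greedy policy with respect to the final value estimate.
    Q = q_values(V)
    policy = [max(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)]
    return V, policy


# Hypothetical toy MDP: action 0 stays in the current state, action 1 moves
# to the other state. Only staying in state 1 yields a reward of 1.
T = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: move
]
R = [
    [0.0, 1.0],  # action 0: reward 1 only when staying in state 1
    [0.0, 0.0],  # action 1: no reward
]

V, policy = value_iteration(T, R)
print(V, policy)
```

With these assumed numbers the greedy policy moves out of state 0 and stays in state 1, and the values approach 9 and 10, since staying in state 1 forever is worth 1 / (1 - 0.9) = 10.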