Reinforcement Learning: Pure Python Implementations

This repository contains the materials and code implementations from my graduate-level Reinforcement Learning (RL) class. The focus is on implementing various RL algorithms purely in Python, using only basic libraries like NumPy, Matplotlib, and OpenAI Gym. The content in this repository closely follows the concepts and algorithms presented in Sutton and Barto's classic "Reinforcement Learning: An Introduction."

Project Structure

The repository is organized into folders corresponding to different topics and RL algorithms. Each folder contains Python scripts implementing the algorithms, along with relevant documentation and example problems. Here's an overview of the main sections:

Dynamic Programming: Implementations of dynamic programming methods such as policy iteration and value iteration.
Finite MDP and Bellman Equations: Solutions and examples involving finite Markov Decision Processes (MDPs) and Bellman equations.
Temporal Difference: Implementations of temporal difference learning methods like SARSA and Q-learning.
Eligibility Traces and TD(lambda): Examples and code related to eligibility traces and the TD(lambda) algorithm.
Monte Carlo Methods: Implementation of Monte Carlo learning methods, including Monte Carlo control with exploring starts.
Actor-Critic with Eligibility Traces: Code for actor-critic algorithms that utilize eligibility traces for learning.
On-policy Control with Approximation: Implementation of on-policy control methods using function approximation.
Multi-arm Bandit Problems: Solutions to various multi-arm bandit problems using different strategies like epsilon-greedy and UCB.
n-Step SARSA: Implementation of n-step SARSA, a temporal difference learning method that balances between SARSA(1) and SARSA(lambda).

Requirements

To run the code in this repository, you’ll need Python 3.x and the following Python libraries:

numpy
matplotlib
gym

You can install the required libraries using pip:

pip install numpy matplotlib gym

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
Actor-Critic with Eligibility Traces		Actor-Critic with Eligibility Traces
Dynamic Programming		Dynamic Programming
Eligibility Traces and TD_lamdba		Eligibility Traces and TD_lamdba
Finite MDP and Bellman Equations		Finite MDP and Bellman Equations
Monte Carlo Methods		Monte Carlo Methods
On-policy Control with Approximation		On-policy Control with Approximation
Temporal Difference		Temporal Difference
multi_arm_bandit_problems		multi_arm_bandit_problems
nStep SARSA		nStep SARSA
LICENSE		LICENSE
README.md		README.md
RLbook.txt		RLbook.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Reinforcement Learning: Pure Python Implementations

Project Structure

Requirements

About

Releases

Packages

Languages

License

mdzahidul-islam/Reinforcement-learning-in-pure-python

Folders and files

Latest commit

History

Repository files navigation

Reinforcement Learning: Pure Python Implementations

Project Structure

Requirements

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages