Author: Damien Hood
Supervisor: Junfeng Wen
Carleton University
Reinforcement Learning (RL) and Generative Models are two major fields in Artificial Intelligence (AI). RL focuses on agent-environment interactions to maximize a given reward, whereas Generative Models, especially diffusion models, generate new data resembling the training data. The intersection of these two fields is ripe for exploration. Specifically, while the Diffusion-QL algorithm exists for RL in continuous action spaces, to the best of our knowledge no existing method efficiently adapts it to discrete action spaces.
Note: This work is conducted as part of an Honours Thesis at Carleton University.
This thesis aims to fill this gap by developing a variant of the Diffusion-QL algorithm for discrete action spaces, termed Discrete Diffusion Q-Learning (D2QL). We evaluate its performance against traditional RL algorithms in various discrete environments.
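To make the idea of diffusion over a discrete action space concrete, the sketch below corrupts a discrete action with a uniform transition kernel, which is one common way to define a forward noising process over a finite set. It is only an illustration under assumptions: the function names, the noise schedule `betas`, and the choice of a uniform kernel are hypothetical and are not taken from the D2QL implementation.

```python
import numpy as np


def uniform_transition_matrix(num_actions, beta):
    """Row-stochastic matrix: keep the action with probability (1 - beta),
    otherwise resample it uniformly over all actions (assumed kernel)."""
    identity = np.eye(num_actions)
    uniform = np.full((num_actions, num_actions), 1.0 / num_actions)
    return (1.0 - beta) * identity + beta * uniform


def forward_marginal(action, betas, num_actions):
    """Distribution of the noised action after applying every step in `betas`."""
    probs = np.eye(num_actions)[action]  # one-hot encoding of the clean action
    for beta in betas:
        probs = probs @ uniform_transition_matrix(num_actions, beta)
    return probs


def sample_noised_action(action, betas, num_actions, rng):
    """Draw one noised action from the forward marginal."""
    probs = forward_marginal(action, betas, num_actions)
    return int(rng.choice(num_actions, p=probs))


# Hypothetical usage: 4 actions, a short linear noise schedule.
rng = np.random.default_rng(0)
schedule = np.linspace(0.05, 0.5, num=5)
print(forward_marginal(2, schedule, num_actions=4))
print(sample_noised_action(2, schedule, num_actions=4, rng=rng))
```

With this kernel the marginal stays a mixture of the original one-hot action and the uniform distribution, so more noise steps push the action distribution toward uniform.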
- Develop D2QL: Adapt the Diffusion-QL algorithm for discrete action spaces by modifying the underlying diffusion model and loss function.
- Test D2QL: Validate the algorithm in discrete environments starting with a gridworld benchmark.
- Evaluate Performance: Conduct experiments to evaluate and compare the D2QL algorithm with existing RL algorithms in these environments.
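As an illustration of the kind of gridworld benchmark mentioned in the objectives, here is a minimal sketch of a deterministic gridworld with four discrete actions. The class name `GridWorld`, the reward values, and the `reset`/`step` interface are assumptions made for this example; the environments in this repository may differ.

```python
class GridWorld:
    """Hypothetical deterministic gridworld: start top-left, goal bottom-right."""

    # Discrete action space: 0 = up, 1 = down, 2 = left, 3 = right.
    ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

    def __init__(self, size=4, step_reward=-1.0, goal_reward=0.0):
        self.size = size
        self.step_reward = step_reward
        self.goal_reward = goal_reward
        self.goal = (size - 1, size - 1)
        self.pos = (0, 0)

    def reset(self):
        """Move the agent back to the start cell and return its state."""
        self.pos = (0, 0)
        return self.pos

    def step(self, action):
        """Apply a discrete action; moves that leave the grid are clipped."""
        d_row, d_col = self.ACTIONS[action]
        row = min(max(self.pos[0] + d_row, 0), self.size - 1)
        col = min(max(self.pos[1] + d_col, 0), self.size - 1)
        self.pos = (row, col)
        done = self.pos == self.goal
        reward = self.goal_reward if done else self.step_reward
        return self.pos, reward, done


# Hypothetical usage: follow a fixed action sequence on a 3x3 grid.
env = GridWorld(size=3)
state = env.reset()
for action in (3, 3, 1, 1):  # right, right, down, down
    state, reward, done = env.step(action)
    print(state, reward, done)
```

A small, fully deterministic environment like this makes it straightforward to compare a learned policy against the known shortest path before moving on to larger discrete environments.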
- d2ql/: Contains all materials related to the Discrete Diffusion Q-Learning algorithm.
  - Includes source code, experiments, and utility scripts.
- literature-review/: Contains papers, summaries, and any related work that informs the research.
- baseline-algorithms/: Stores implementations of baseline algorithms for comparative analysis.
- environments/: Contains custom and/or modified reinforcement learning environments used in the research.
- data/: A place for experimental data, both raw and processed.
- results/: Holds generated results, figures, and tables from experiments.
- documentation/: Comprehensive documentation directory.
  - Includes details about the code and drafts of the written thesis.
- misc/: A catch-all folder for supplementary materials.
Setup instructions will go here
Code examples, API details, etc. will go here
This project is licensed under the MIT License - see the LICENSE.md file for details.