Using LLMs to improve their own underlying architecture through an evolutionary algorithm is, besides being poetically satisfying, also quite effective.
EvoPrompting on Reinforcement Learning Architectures aims to create novel reinforcement learning (RL) algorithms by combining an evolutionary algorithm with a large language model (LLM) that acts as the crossover operator.
This project is based on the paper: EvoPrompting: Language Models for Code-Level Neural Architecture Search by Angelica Chen, David M. Dohan, and David R. So.
The authors successfully demonstrate the use of an evolutionary algorithm with an LLM to create novel machine learning architectures for the MNIST problem. In this project, we attempt to replicate their success for reinforcement learning algorithms. The main adaptations are:
- Keep the task T, but replace the dataset D with an environment E (CartPole-v1).
- Use minimal implementations of popular algorithms as seed code (A2C, ACER, DQN, PPO, REINFORCE).
- Adapt the LLM tuning approach, given our limited access to the embeddings of a large-parameter model (the adapted search loop is sketched below).
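To make the adaptation concrete, the sketch below shows roughly how the evolutionary loop could look on our side. The helpers `evaluate` (train a candidate on CartPole-v1 and return its mean episode return) and `llm_crossover` (wrap whichever LLM backend we end up using) are placeholders for illustration, not code from the original paper.

```python
import random

def evolve(seed_programs, evaluate, llm_crossover,
           generations=10, population_size=20, n_parents=2):
    """Hypothetical EvoPrompting-style loop: the LLM acts as the crossover operator."""
    # Score the seed programs once to form the initial population.
    population = [(code, evaluate(code)) for code in seed_programs]
    for _ in range(generations):
        children = []
        while len(children) < population_size:
            # Sample a few candidates and keep the fittest ones as parents.
            sampled = random.sample(population, k=min(4, len(population)))
            parents = sorted(sampled, key=lambda p: p[1])[-n_parents:]
            # The LLM "crosses over" the parent programs into a new child program.
            child = llm_crossover([code for code, _ in parents])
            children.append((child, evaluate(child)))
        # Survivor selection: keep the best individuals overall.
        population = sorted(population + children, key=lambda p: p[1])[-population_size:]
    return max(population, key=lambda p: p[1])
```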
We begin with minimal implementations of fan-favorite reinforcement learning algorithms:
- A2C - Advantage Actor-Critic
- ACER - Actor-Critic with Experience Replay
- DQN - Deep Q-Network
- PPO - Proximal Policy Optimization
- REINFORCE - Monte Carlo policy gradient
These seed implementations will serve as the starting point for the evolutionary algorithm.
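For a sense of what a seed looks like, here is a condensed REINFORCE agent for CartPole-v1 in the same minimal style. The network size and hyperparameters are illustrative rather than the exact values in the seed files, and it uses the classic Gym reset/step API.

```python
import gym
import torch
import torch.nn as nn

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 128), nn.ReLU(),
                       nn.Linear(128, 2), nn.Softmax(dim=-1))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
gamma = 0.99

for episode in range(500):
    obs, log_probs, rewards, done = env.reset(), [], [], False
    while not done:
        # Sample an action from the current policy.
        dist = torch.distributions.Categorical(
            policy(torch.as_tensor(obs, dtype=torch.float32)))
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        obs, reward, done, _ = env.step(action.item())
        rewards.append(reward)
    # Discounted returns, accumulated backwards through the episode.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    # REINFORCE loss: negative log-probability weighted by the return.
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```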
The chosen environment for this project is CartPole-v1 from OpenAI Gym.
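Fitness for the evolutionary loop can then be read straight off the environment, for example as the mean episode return over a few evaluation rollouts. The `select_action` callable below stands in for whatever interface a candidate program exposes and is purely illustrative.

```python
import gym

def fitness(select_action, episodes=10):
    """Mean episode return of a candidate policy on CartPole-v1 (classic Gym API)."""
    env = gym.make("CartPole-v1")
    total = 0.0
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            obs, reward, done, _ = env.step(select_action(obs))
            total += reward
    return total / episodes
```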
Because we do not have access to the embeddings of a 65-billion-parameter LLM, we are currently working on an alternative method for fine-tuning GPT-3.
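One alternative we are exploring is plain few-shot prompting: the selected parent programs and their measured returns are concatenated into a single prompt, annotated with a target return, and the LLM completes a child program. The prompt format below is our own assumption, not the format used in the paper.

```python
def build_crossover_prompt(parents, target_margin=50.0):
    """parents: list of (code_str, mean_return) tuples for the selected parent programs."""
    sections = []
    for code, score in parents:
        # Each parent program is shown together with the return it achieved.
        sections.append(f"# Mean episode return: {score:.1f}\n{code}\n")
    # Ask the model for a child that beats the best parent by some margin.
    target = max(score for _, score in parents) + target_margin
    sections.append(f"# Mean episode return: {target:.1f}\n")
    return "\n".join(sections)
```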
Results and insights gained from the evolutionary algorithm will be uploaded as soon as they are available.
Special thanks to the authors of the original paper, Angelica Chen, David M. Dohan, and David R. So, for their work and inspiration.