LLM Reinforcement Learning Framework

Imagine doing everything in life without ever receiving any reward for it. A large share of the foundation models out there live this tragic life, and it's time to change that.

The goal here is to create a solid framework for RL on LLMs, much like Stable Baselines (which builds on OpenAI's Gym interface) is for the more traditional RL landscape. I hope to share largely the same values as their repo, with a few additions:

Core Values

Stay close to the pseudocode from the literature (nice for implementing)
Reduce the overhead of scaling to larger models across different machines (nice for training)
Implement the latest schemes and methods and evaluate them in various environments (nice for evaluations)
Make this a nice place for researchers in general.

Planned Environments

(ordered by ambition)

Guess the city
Math
Chess
SWE/MLE-bench
Factorio
Minecraft
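
To make the idea of a text environment concrete, here is a minimal sketch of what a single-turn "Math" environment could look like. Nothing here is taken from the repo: the names `ArithmeticEnv` and `StepResult`, the `reset`/`step` interface (loosely modelled on Gym-style environments), and the 0/1 reward are all assumptions for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: str  # text shown to the model next (empty when the episode ends)
    reward: float     # scalar reward for the last action
    done: bool        # True when the episode is over


class ArithmeticEnv:
    """Hypothetical single-turn environment: one addition question, one answer."""

    def __init__(self, seed: int = 0, max_operand: int = 20):
        self._rng = random.Random(seed)
        self._max_operand = max_operand
        self._answer = None  # set by reset(), cleared by step()

    def reset(self) -> str:
        """Sample a new question and return it as the prompt."""
        a = self._rng.randint(0, self._max_operand)
        b = self._rng.randint(0, self._max_operand)
        self._answer = a + b
        return f"What is {a} + {b}? Reply with a single integer."

    def step(self, action: str) -> StepResult:
        """Score the model's text reply: 1.0 if it parses to the right integer."""
        if self._answer is None:
            raise RuntimeError("call reset() before step()")
        try:
            correct = int(action.strip()) == self._answer
        except ValueError:
            correct = False  # unparseable replies simply earn no reward
        self._answer = None
        return StepResult(observation="", reward=1.0 if correct else 0.0, done=True)
```

In use, `prompt = ArithmeticEnv(seed=0).reset()` would be sent to the policy, and the generated text passed to `step` to obtain the reward. The more ambitious environments on the list would need multi-turn episodes, which this sketch deliberately leaves out.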

Contributing

This repo is under construction. If you want to contribute, please do :) If you have ideas on how to make it better, cleaner, leaner, or simpler, feel free to contact me on LinkedIn, email, or whatever channel you prefer.

Contact

📧 [email protected]

