LLM Reinforcement Learning Framework

Imagine doing everything in life without ever gaining any reward for it whatsoever. A very large portion of the foundation models out there live this tragic life, and it's time to change that.

The goal here is to create a solid framework for LLM/RL, much as Stable Baselines (built around OpenAI's Gym interface) is for the more traditional RL landscape. I hope to share largely the same values as their repo, with a few additions:

Core Values

  • Stay close to the pseudocode from the literature. (nice for implementing)
  • Reduce the overhead of scaling to larger models across different machines. (nice for training)
  • Implement the latest schemes and methods and evaluate them in various environments (nice for evaluations)
  • Make this a nice place for researchers in general.

Planned Environments

(ordered by ambition)

  1. Guess the city
  2. Math
  3. Chess
  4. SWE/MLE-bench
  5. Factorio
  6. Minecraft
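
To give a feel for what the simplest environment on this list might look like, here is a minimal sketch of "Guess the city" as a single-turn, Gym-style environment. Nothing here is the repo's actual API: `GuessTheCityEnv`, `StepResult`, the `reset`/`step` signatures, and the substring-match reward are all assumptions made purely for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class StepResult:
    reward: float  # 1.0 for a correct guess, 0.0 otherwise
    done: bool     # single-turn episode, so True after one step


class GuessTheCityEnv:
    """Hypothetical environment: show a city description, score the model's guess."""

    def __init__(self, cities, seed=0):
        # `cities` maps a city name to a short description (assumed format).
        self.cities = dict(cities)
        self.rng = random.Random(seed)
        self.target = None

    def reset(self):
        """Pick a target city and return the prompt to hand to the LLM."""
        self.target = self.rng.choice(sorted(self.cities))
        return f"Which city is this? {self.cities[self.target]}"

    def step(self, completion):
        """Score the LLM's text completion against the current target."""
        if self.target is None:
            raise RuntimeError("call reset() before step()")
        # Assumed reward rule: case-insensitive substring match on the city name.
        correct = self.target.lower() in completion.lower()
        self.target = None
        return StepResult(reward=1.0 if correct else 0.0, done=True)


# Usage: a hard-coded string stands in for the model's completion.
env = GuessTheCityEnv({"Paris": "Home of the Eiffel Tower."})
prompt = env.reset()
result = env.step("I think it is Paris.")
print(prompt, result.reward)
```

A real environment would probably need a stricter answer parser and multi-turn episodes for the more ambitious entries; the point of the sketch is only that the model finally gets a reward signal for what it says.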

Contributing

This repo is under construction. If you want to contribute, please do :) If you want to share ideas on how to make it better/cleaner/leaner/simpler, feel free to contact me on LinkedIn, by email, or through whatever channel you prefer.

Contact

📧 [email protected]