algopapi/llmrl

LLM Reinforcement Learning Framework

Imagine doing everything in life without ever receiving any reward for it. A large share of the foundation models out there live this tragic life, and it's time to change that.

The goal here is to create a solid framework for LLM reinforcement learning, much like Stable Baselines (built on OpenAI's Gym) is for the more traditional RL landscape. I hope to share largely the same values as their repo, with a few additions:

Core Values

  • Stay close to the pseudocode from the literature. (nice for implementing)
  • Reduce the overhead of scaling to larger models across different machines. (nice for training)
  • Implement the latest schemes and methods and evaluate them in various environments. (nice for evaluations)
  • Make this a nice place for researchers in general.

Planned Environments

(ordered by ambition)

  1. Guess the city
  2. Math
  3. Chess
  4. SWE/MLE-bench
  5. Factorio
  6. Minecraft

Contributing

This repo is under construction. If you want to contribute, please do :) If you have ideas on how to make it better, cleaner, leaner, or simpler, feel free to contact me on LinkedIn, email, or whatever channel you prefer.

Contact

📧 [email protected]
