Releases · clp-research/clembench · GitHub

16 Feb 14:09

phisad

v0.3 Latest

What's Changed

Framework

Change: clem now automatically assumes self-play when only a single model is given and the game is multi-player, by @phisad
Change: Removed the option to run all games (this simplifies the code; running all games is now done in a pipeline script) by @phisad
Change: Restructured the results folder (now grouped by model pairing instead of by game) by @phisad in #33
Added re-prompting hooks for DialogueGameMaster by @lpfennigschmidt in #42
Added option to specify the filename for instances in GameInstanceGenerator by @AnneBeyer in #44
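
The self-play rule and the new results grouping can be illustrated with a minimal sketch. This is not the clembench implementation: the helper names `resolve_players` and `results_dir`, the `--` separator between model names, and the example model and game names are all assumptions made for illustration.

```python
from pathlib import Path
from typing import List


def resolve_players(models: List[str], num_players: int) -> List[str]:
    """Return one model name per player role (hypothetical helper)."""
    if not models:
        raise ValueError("at least one model is required")
    if len(models) == 1 and num_players > 1:
        # Self-play: the single given model fills every role.
        return models * num_players
    if len(models) != num_players:
        raise ValueError(
            f"expected {num_players} models, got {len(models)}"
        )
    return list(models)


def results_dir(root: str, players: List[str], game: str) -> Path:
    """Build a results path grouped by model pairing first, then by game.

    The "--" separator is an assumption, not taken from the release notes.
    """
    pairing = "--".join(players)
    return Path(root) / pairing / game


# Example with made-up names: one model, two-player game.
players = resolve_players(["model-a"], num_players=2)
print(results_dir("results", players, "some_game").as_posix())
```

The point of the layout is that all games played by one pairing sit under a single folder, rather than each game folder holding every pairing.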

CLI

Change: Restructured CLI calls (now with -g and -m options) by @phisad @briemadu in #35
Added CLI option to set max_token, the maximum number of tokens to generate, by @phisad in #48

Backends

Added Hugging Face backend and documentation (which also introduces a local HF model registry) by @Gnurro in #2 #3 #17 #18 #29 #31 #32 #45 #46
Added Cohere API by @sherzod-hakimov
Added Mistral API by @sherzod-hakimov
Added generic OpenAI-compatible API by @davidschlangen

Documentation

Improved docs on how to run the benchmark by @davidschlangen @sherzod-hakimov @briemadu
Improved docs on how to add and prototype games by @phisad @briemadu @davidschlangen
Improved docs on how to log games by @briemadu
Improved docs on how to add a backend by @briemadu in #21
Improved docs on how to run games locally by @sherzod-hakimov @AnneBeyer

Dependencies

bumped accelerate to 0.25.0
bumped transformers to 4.36.0
bumped openai to 1.7.0
bumped anthropic to 0.3.0
added cohere 4.34
added mistralai 0.0.12
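
To check a local environment against the versions listed above, a small script along these lines may help. It is only a sketch: the release notes do not say whether the project pins these versions exactly or treats them as minimums, so the exact-match comparison here is an assumption, and the helper names are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional, Tuple

# Versions as listed in the v0.3 release notes.
RELEASE_VERSIONS: Dict[str, str] = {
    "accelerate": "0.25.0",
    "transformers": "4.36.0",
    "openai": "1.7.0",
    "anthropic": "0.3.0",
    "cohere": "4.34",
    "mistralai": "0.0.12",
}


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Look up the installed version of each package (None if missing)."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def mismatches(
    expected: Dict[str, str], installed: Dict[str, Optional[str]]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (expected, installed)} where the strings differ.

    Plain string comparison: "4.34" and "4.34.0" count as different.
    """
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    report = mismatches(RELEASE_VERSIONS, installed_versions(RELEASE_VERSIONS))
    for name, (want, have) in report.items():
        print(f"{name}: release notes list {want}, installed {have}")
```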

... and other small changes.

Full Changelog: v0.2...v0.3

Contributors

davidschlangen, phisad, and 5 other contributors

05 Jul 14:26

phisad

Version 0.2

Submission version