threeColorFr

Organizations

@cocacola-lab

Stars

LLM-eval

7 repositories

SuperCLUE: a comprehensive benchmark for general-purpose Chinese large models | A Benchmark for Foundation Models in Chinese

3,046 stars · 98 forks · Updated May 23, 2024

Supercharge Your LLM Application Evaluations 🚀

Python · 7,691 stars · 782 forks · Updated Dec 26, 2024

Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]

Python · 1,655 stars · 79 forks · Updated Oct 26, 2023

The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.

724 stars · 47 forks · Updated May 8, 2024

A unified evaluation framework for large language models

Python · 2,493 stars · 185 forks · Updated Oct 28, 2024

The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".

1,454 stars · 92 forks · Updated Jun 3, 2024

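Chinese LLM capability evaluation leaderboard: currently covers 128 large models, including commercial models such as chatgpt, gpt-4o, Google gemini, Baidu ERNIE Bot (文心一言), Alibaba Tongyi Qianwen (通义千问), Baichuan, iFlytek Spark, SenseTime senseChat, and minimax, as well as open-source models such as qwen2.5, llama3.1, glm4, InternLM2.5 (书生), openbuddy, and AquilaChat. It provides not only a capability score leaderboard but also the raw outputs of every model.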

3,108 138 Updated Dec 25, 2024