Add Data Explorer Tab #3632

Open
wants to merge 10 commits into
base: main

@ygtangg ygtangg commented Nov 26, 2024

Why are these changes needed?

This Data Explorer provides a visually engaging and interactive tool that allows users to explore and draw insights from the leaderboard (conversation) data. It fosters transparency in the ranking process and enhances users’ trust in our leaderboard.
(link to the explorer website: link)
Screenshot 2024-11-25 at 17 58 20
Screenshot 2024-11-25 at 17 58 40

Related issue number (if applicable)

Checks

  • I've run format.sh to lint the changes in this PR.
  • I've included any doc changes needed.
  • I've made sure the relevant tests are passing (if applicable).

@aangelopoulos
Collaborator

You're amazing! So fast!

@aangelopoulos
Collaborator

Really wonderful job :) will review

@aangelopoulos
Collaborator

Looks wonderful. One bug: Dark mode causes issues because the text doesn't turn white. Can you fix? @ygtangg

image

@aangelopoulos aangelopoulos self-requested a review November 26, 2024 20:21
@ygtangg
Author

ygtangg commented Nov 27, 2024

I fixed the dark mode issue on the website. No change to the iframe link is needed.

@aangelopoulos
Collaborator

Where is the fix? I don't see another commit or PR.

@ygtangg
Author

ygtangg commented Nov 27, 2024

I changed the HTML website, whose files are in the arena-leaderboard-v2 repo. Do you think we should move them in here as well?

@aangelopoulos
Collaborator

Oh got it!
Good question. Let's keep them separate for now, and we can merge at some point later.

@aangelopoulos
Collaborator

LGTM. @infwinston ?

@lisadunlap
Collaborator

this is suuuuper cool! @aangelopoulos @infwinston lmk if yall need any help to get this merged, this will be an amazing feature

Comment on lines 99 to 104
model_keys = ['chatgpt-4o-latest', 'gemini-1.5-pro-exp-0827','gpt-4o-mini-2024-07-18','claude-3-5-sonnet-20240620','gemini-1.5-flash-exp-0827','llama-3.1-405b-instruct','gemini-1.5-pro-api-0514','mistral-large-2407','reka-core-20240722','gemini-1.5-flash-api-0514', 'deepseek-coder-v2-0724','yi-large','llama-3-70b-instruct','qwen2-72b-instruct','claude-3-haiku-20240307','llama-3.1-8b-instruct','mistral-large-2402','command-r','mixtral-8x22b-instruct-v0.1','gpt-3.5-turbo-0613']
output_tokens_per_USD = [66.66666667000001,200.0,1666.666667,66.66666667000001,3333.333333,333.3333333,200.0,166.6666667,166.6666667,3333.333333,3333.333333,333.3333333,1265.8227849999998,1111.111111,800.0,11111.11111,166.6666667,666.6666667,166.6666667,500.0]
score=[1316.1559008799543,1300.8583398843484,1273.6004783067303,1270.113546648134,1270.530573909608,1266.244657076764,1259.2844314017723,1249.8268751367714,1229.2148108171098,1226.8769924152105,1214.5634252743123,1212.4668382698005,1206.3236747009742,1186.7832147344182,1178.5484948812955,1167.8793593807711,1157.271872307139,1148.6665817312062,1147.0325504217642,1117.0289441863001]
fig = px.scatter(x=output_tokens_per_USD, y=score, title="Quality vs. Cost Effectiveness", labels={  # keys are "x"/"y" because the data is passed as plain lists
"x": "# of output tokens per USD (in thousands)",
"y": "Arena Score"}, log_x=True, text=model_keys)
Member

Let's probably not push this info here. It'll be hard to maintain/update.

@sophie200 sophie200 Dec 18, 2024

Yes, will upload the Plotly graph to Google storage and then embed it with an iframe.
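
For reference, a minimal sketch of what the iframe approach could look like. It assumes the figure has already been exported to a standalone HTML file (for example with Plotly's `fig.write_html(...)`) and uploaded to a publicly readable location. The bucket URL, file name, and the `iframe_snippet` helper are hypothetical and are not part of this PR.

```python
from html import escape


def iframe_snippet(src_url: str, height_px: int = 600) -> str:
    """Return an <iframe> tag that embeds an externally hosted HTML page."""
    # Escape the URL so quotes or ampersands cannot break out of the attribute.
    safe_url = escape(src_url, quote=True)
    return (
        f'<iframe src="{safe_url}" width="100%" height="{height_px}" '
        'style="border: none;"></iframe>'
    )


# Hypothetical bucket and file name, used only for illustration.
PLOT_URL = "https://storage.googleapis.com/example-bucket/quality_vs_cost.html"
snippet = iframe_snippet(PLOT_URL, height_px=500)
print(snippet)
```

The resulting string could then be passed to something like `gr.HTML(snippet)` in the tab, so the model list, prices, and scores live in the hosted file and can be refreshed without touching this repo.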

@sophie200 sophie200 force-pushed the main branch 6 times, most recently from 0b80433 to 4730e8d Compare December 19, 2024 22:26
5 participants