Clean Upvote and Downvote data #3611

Merged
merged 10 commits into from
Nov 13, 2024

Conversation

CodingWithTim
Collaborator

@CodingWithTim CodingWithTim commented Nov 8, 2024

Why are these changes needed?

Related issue number (if applicable)

Checks

  • I've run format.sh to lint the changes in this PR.
  • I've included any doc changes needed.
  • I've made sure the relevant tests are passing (if applicable).

@CodingWithTim
Collaborator Author

> python clean_chat_data.py --action-type upvote

This will grab all the upvotes.

Member

@infwinston infwinston left a comment

Thanks @CodingWithTim, I left some comments.

fastchat/serve/monitor/clean_chat_data.py (outdated, resolved)
fastchat/serve/monitor/clean_chat_data.py (outdated, resolved)
@CodingWithTim
Collaborator Author

@infwinston Suggestions integrated.

Member

@infwinston infwinston left a comment

One critical thing should be fixed: chunk_size shouldn't be 1. Otherwise we will introduce a lot of per-task overhead, and it might end up even slower than the sequential implementation.
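To illustrate the concern, here is a minimal sketch, assuming the parallel step uses `multiprocessing.Pool.imap`, whose `chunksize` argument controls how many items are handed to a worker per task. `process_record`, `pick_chunksize`, and `clean_parallel` are hypothetical names for illustration, not the functions in this PR.

```python
import multiprocessing as mp


def process_record(record):
    # Hypothetical stand-in for the per-record cleaning work.
    return {"result": record * 2}


def pick_chunksize(n_items, n_workers, batches_per_worker=4):
    # Aim for a handful of batches per worker so that each round trip
    # between parent and child carries many items instead of one.
    return max(1, n_items // (n_workers * batches_per_worker))


def clean_parallel(records, n_workers=4):
    chunksize = pick_chunksize(len(records), n_workers)
    with mp.Pool(n_workers) as pool:
        # imap yields results in input order; with chunksize=1 every
        # record would be pickled and dispatched as its own task.
        return list(pool.imap(process_record, records, chunksize=chunksize))
```

On platforms that start workers with spawn, `clean_parallel(...)` should be called under an `if __name__ == "__main__":` guard. The heuristic in `pick_chunksize` is only one reasonable choice; a fixed, moderately large value would address the same overhead.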

fastchat/serve/monitor/clean_chat_data.py (outdated, resolved)
fastchat/serve/monitor/clean_chat_data.py (outdated, resolved)
Comment on lines 158 to 167
# Aggregate results from child processes
ct_invalid_conv_id = sum(
    [data["ct_invalid_conv_id"] for data in results if "ct_invalid_conv_id" in data]
)
ct_invalid = sum([data["ct_invalid"] for data in results if "ct_invalid" in data])
ct_network_error = sum(
    [data["ct_network_error"] for data in results if "ct_network_error" in data]
)
all_models = set([data["model"] for data in results if "model" in data])
chats = [data["result"] for data in results if "result" in data]
Member

Can we merge these into one for loop?
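A single-pass version might look like the sketch below. It assumes each entry in `results` is a dict with the same optional keys as in the quoted snippet; the wrapper function `aggregate` is a hypothetical name added here so the example is self-contained.

```python
def aggregate(results):
    # Walk the child-process results once instead of five times.
    ct_invalid_conv_id = 0
    ct_invalid = 0
    ct_network_error = 0
    all_models = set()
    chats = []
    for data in results:
        # Missing counters contribute 0, matching the `if key in data` filters.
        ct_invalid_conv_id += data.get("ct_invalid_conv_id", 0)
        ct_invalid += data.get("ct_invalid", 0)
        ct_network_error += data.get("ct_network_error", 0)
        if "model" in data:
            all_models.add(data["model"])
        if "result" in data:
            chats.append(data["result"])
    return ct_invalid_conv_id, ct_invalid, ct_network_error, all_models, chats
```

This keeps the same outputs as the five comprehensions while iterating over `results` only once.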

@infwinston infwinston merged commit 5ac9372 into main Nov 13, 2024
2 checks passed
@infwinston infwinston deleted the clean-upvote-downvote branch November 13, 2024 01:48
2 participants