
add UltraLM-13b-V2.0/UltraLM-13b-V2.0-best-of-16/UltraLM-13b-best-of-16 to AlpacaEval #139

Merged
merged 3 commits into from
Sep 30, 2023

Conversation

lifan-yuan
Contributor

No description provided.

```yaml
ultralm-13b:
  prompt_template: "ultralm-13b-best-of-16/prompt.txt"
  fn_completions: "huggingface_local_completions"
  completions_kwargs:
```
Collaborator

did you not have to change the completion function to do best-of-16?

Contributor Author

Hi Yann,
I kept `fn_completions` the same as for other models, since there is no instruction on what options are allowed in this field. Can I add a comment on that line, such as `fn_completions: "huggingface_local_completions" # best-of-16 sampling`? Or can I change `huggingface_local_completions` to something like `huggingface_best_of_16_completions`?

Collaborator

This field is meant to let others replicate your results. All the options can be found here: https://github.com/tatsu-lab/alpaca_eval/blob/main/src/alpaca_eval/decoders/__init__.py

In your case, it seems that you cannot replicate your results with our code. Are all the `completions_kwargs` parameters correct, then? Given that you can't replicate your results with our code, I'd suggest just removing `fn_completions` and `completions_kwargs`, and adding a comment in the yml that says the results can't be reproduced in alpaca_eval because they require best-of-n sampling.
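The suggested change might look something like this (a minimal sketch; the key names follow the snippet above, but the comment wording and the exact final layout of the merged config are illustrative assumptions, not the actual merged file):

```yaml
ultralm-13b:
  prompt_template: "ultralm-13b-best-of-16/prompt.txt"
  # NOTE (illustrative): results cannot be reproduced with alpaca_eval's
  # built-in decoders because they require best-of-16 sampling, which is
  # not implemented in huggingface_local_completions. fn_completions and
  # completions_kwargs are therefore omitted; see the uploaded model
  # outputs instead.
```

Dropping `fn_completions` entirely, rather than pointing at a decoder that cannot actually reproduce the outputs, makes it explicit to readers of the config that replication requires custom sampling code.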

@YannDubs
Collaborator

Hi @lifan-yuan, can you upload the model outputs also?

@lifan-yuan
Contributor Author

> Hi @lifan-yuan, can you upload the model outputs also?

Sorry for the missing files. I will create a new commit adding the output files once we settle on the right way to represent the best-of-16 sampling.

@lifan-yuan
Contributor Author

Hi @YannDubs, I've added a comment in the yml file and explained how the results can be reproduced. The model_outputs are updated as well.

@YannDubs YannDubs merged commit 59a9cab into tatsu-lab:main Sep 30, 2023
@YannDubs
Collaborator

LGTM! Great job!
