
Need some context for certain args for Instruct Human Eval #256

Open
teknium1 opened this issue Jul 18, 2024 · 2 comments

@teknium1

Hey all, what is the n_samples argument for Instruct HumanEval about?

The docs say 200 as if it's a fixed setting that should always be used, but I can't understand why.

Also, when the turns format is structured, does this look right for ChatML?

--instruction_tokens "<|im_start|>user\n","<|im_end|>\n","<|im_start|>assistant\n"

Without quoting each string, it gave an error, so I assume this is how the arg is meant to be used?
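
For my own sanity, here is a minimal sketch of how I assume the harness stitches those three tokens around the instruction (the split into user/end/assistant tokens is my assumption, not taken from the source):

# Hypothetical sketch, not the harness's actual code: assumes
# --instruction_tokens is split on commas into exactly
# (user_token, end_token, assistant_token).
user_token = "<|im_start|>user\n"
end_token = "<|im_end|>\n"
assistant_token = "<|im_start|>assistant\n"

def build_prompt(instruction: str, context: str) -> str:
    # Wrap the instruction in the user turn, then open the assistant
    # turn so the model continues with the code completion.
    return f"{user_token}{instruction}{end_token}{assistant_token}{context}"

print(build_prompt("Complete the function below.", "def add(a, b):\n"))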

@teknium1
Author

I ran the Instruct HumanEval benchmark, and the eval results.json shows this:

"eos": "<|endoftext|>",

whereas the actual EOS for ChatML should be <|im_end|>. I don't really see why this is there.
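
In the meantime, a workaround sketch for stopping on the ChatML end token when generating directly with HF transformers (this is outside the harness, and the model id is a placeholder):

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    StoppingCriteria,
    StoppingCriteriaList,
)

class StopOnTokens(StoppingCriteria):
    # Stop generation once the last emitted token is one of the stop ids.
    def __init__(self, stop_ids):
        self.stop_ids = set(stop_ids)

    def __call__(self, input_ids, scores, **kwargs):
        return input_ids[0, -1].item() in self.stop_ids

model_id = "your/chatml-model"  # placeholder, not a real checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

stop_ids = tokenizer.convert_tokens_to_ids(["<|im_end|>"])
inputs = tokenizer(
    "<|im_start|>user\nWrite add(a, b).<|im_end|>\n<|im_start|>assistant\n",
    return_tensors="pt",
)
out = model.generate(
    **inputs,
    max_new_tokens=128,
    stopping_criteria=StoppingCriteriaList([StopOnTokens(stop_ids)]),
)
print(tokenizer.decode(out[0]))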

@Muennighoff
Contributor

I recommend just running humanevalsynthesize from humanevalpack, which offers the same and more. Instructions for running are here: https://github.com/bigcode-project/octopack?tab=readme-ov-file#run and here: https://github.com/bigcode-project/bigcode-evaluation-harness/blob/main/docs/README.md#humanevalpack

You may just need to add your instruction format here:

prompt = f"<|user|>\n{inp}\n<|assistant|>\n{prompt_base}"
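
For ChatML that would presumably become something like the following (my sketch; inp and prompt_base come from the surrounding humanevalpack code):

# Hypothetical ChatML variant of the prompt line above; inp and
# prompt_base are defined by the surrounding harness code.
prompt = f"<|im_start|>user\n{inp}<|im_end|>\n<|im_start|>assistant\n{prompt_base}"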
