
Eval results aren't matching the paper #29

Open
sia-cerebras opened this issue Jan 24, 2024 · 1 comment

@sia-cerebras

I'm not able to match the 3-shot eval results reported in the paper for the pretrained model.
I downloaded the Meditron-7b model from HF.
For example, for MedQA I get 0.353, while the paper reports 0.287±0.008.
My command was: ./inference_pipeline.sh -b medqa4 -c meditron-7b -s 3 -m 0 -out_dir out_dir

On PubMedQA, I got 0.486, but the paper reports 0.693±0.151.
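For reference, here's a minimal sketch (not the repo's evaluation code) of how I'm computing accuracy and the mean ± std across seeds when comparing against the paper's numbers. The output file names and JSON fields below are assumptions about my own prediction dumps, not the pipeline's format.

```python
# Minimal sketch, assuming each run writes one JSONL file with
# "prediction" and "gold" fields per example (hypothetical format).
import json
import statistics

def accuracy(pred_file):
    """Fraction of examples where the predicted answer matches the gold label."""
    with open(pred_file) as f:
        records = [json.loads(line) for line in f]
    correct = sum(r["prediction"] == r["gold"] for r in records)
    return correct / len(records)

# One accuracy value per seed / run of the pipeline.
scores = [accuracy(f"out_dir/medqa4_seed{s}.jsonl") for s in (0, 1, 2)]
print(f"{statistics.mean(scores):.3f} ± {statistics.stdev(scores):.3f}")
```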

@jfernandrezj

In case anybody is interested, I ran my own eval and it performs far worse than Mistral:7b on TNM coding!
