copa+…As a result, C1 or C2? prompting error #65
Hey, Stella! Can you make sure you're using the latest version? Charlie suggested the following in Slack:

pip uninstall "promptsource @ git+https://github.com/bigscience-workshop/promptsource@eval-hackathon"
pip install "promptsource @ git+https://github.com/bigscience-workshop/promptsource@eval-hackathon"

Update: Some of the …
@jon-tow sorry, I didn't see this response. I'm a little confused by your most recent comment… if that conditional fails, does that mean that there isn't a prompt? In the event that your try-catch code activates, what does the prompt end up being?
Yes: the prompt ends up being the filled-in template described in the consequent block of the conditional. Repro: https://colab.research.google.com/drive/1PlQx_2fkyZxHECqRZ8aTGdA0RiFQgvQS?usp=sharing
I am dissatisfied with this solution but haven't had the bandwidth to work on it at all. Leaving this comment mostly for my future reference.
Also getting this one:

[default0]: Assigning unique IDs to 'copa+…As a result, C1 or C2?' docs
[default0]: 100%|██████████| 100/100 [00:00<00:00, 15571.37ex/s]
[default0]: Filtering invalid docs from 'copa+…As a result, C1 or C2?'
[default0]: 100%|██████████| 1/1 [00:00<00:00, 4.48ba/s]
[default0]: Constructing 'copa+…As a result, C1 or C2?' contexts and requests
[default0]:   0%|          | 0/100 [00:00<?, ?it/s]
[default0]: Traceback (most recent call last):
[default0]:   File "./tasks/eval_harness/evaluate_bsevalharness.py", line 499, in <module>
[default0]:     main()
[default0]:   File "./tasks/eval_harness/evaluate_bsevalharness.py", line 479, in main
[default0]:     results = evaluator.evaluate(lm=adaptor, task_dict={task_name: task}, bootstrap_iters=args.bootstrap_iters, rng=np.random.default_rng(args.seed))
[default0]:   File "/gpfsssd/scratch/rech/six/commun/experiments/muennighoff/lm-evaluation-harness/lm_eval/evaluator.py", line 180, in evaluate
[default0]:     ctx, fewshotex_logging_info = task.fewshot_context(
[default0]:   File "/gpfsssd/scratch/rech/six/commun/experiments/muennighoff/lm-evaluation-harness/lm_eval/api/task.py", line 404, in fewshot_context
[default0]:     prompt = self.doc_to_text(doc)
[default0]:   File "/gpfsssd/scratch/rech/six/commun/experiments/muennighoff/lm-evaluation-harness/lm_eval/api/task.py", line 286, in doc_to_text
[default0]:     text, _ = self.prompt_template.apply(doc)
[default0]: ValueError: not enough values to unpack (expected 2, got 1)
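The unpack failure in the traceback above can be reproduced in isolation: promptsource-style templates separate the input and target segments with `|||`, and when the rendered prompt contains no target segment, splitting yields a single element, so the two-value unpack in `doc_to_text` raises. A minimal standalone sketch (not using promptsource itself):

```python
# A rendered prompt with no "|||"-separated target segment, e.g. because the
# template's {% if question == "effect" %} guard did not match the doc.
rendered = 'The man fell. As a result, "A" or "B"?'

parts = rendered.split("|||")
print(len(parts))  # 1

try:
    # Mirrors `text, _ = self.prompt_template.apply(doc)` in the traceback.
    text, target = parts
except ValueError as err:
    print(err)  # not enough values to unpack (expected 2, got 1)
```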
@Muennighoff Thanks for reporting! We can't robustly resolve this until it gets fixed on the promptsource side. NOTE: you may need to clear the cache files in your HF cache.
Ah yeah, the issue comes from prompts like the one below, where only "cause" or only "effect" works. When the prompt is of the other type, it returns a list with just one element, thus erroring out.

jinja: {% if question == "effect" %}
{{ premise }} As a result, "{{ answer_choices[0] }}" or "{{ answer_choices[1] }}"? ||| {% if label != -1 %}{{ answer_choices[label] }}{% endif %}
{% endif %}
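One workaround is to check that both segments are present after applying the template, and skip docs the template doesn't cover instead of unpacking blindly. A sketch under assumptions: `render_effect_template` is a hypothetical stand-in for rendering the Jinja template above, and `safe_apply` is a hypothetical guard, not the harness's actual fix:

```python
def render_effect_template(doc):
    """Hypothetical stand-in for rendering the "effect"-only Jinja template:
    docs whose question type doesn't match render to an empty string."""
    if doc["question"] != "effect":
        return ""  # the {% if question == "effect" %} guard fails
    text = (f'{doc["premise"]} As a result, '
            f'"{doc["answer_choices"][0]}" or "{doc["answer_choices"][1]}"?')
    target = doc["answer_choices"][doc["label"]] if doc["label"] != -1 else ""
    return f"{text} ||| {target}"

def safe_apply(doc):
    """Split the rendered prompt on "|||"; return None when the template did
    not produce both an input and a target segment, so the caller can filter
    the doc out instead of crashing on `text, _ = ...`."""
    parts = [p.strip() for p in render_effect_template(doc).split("|||")]
    if len(parts) != 2:
        return None
    return parts

cause_doc = {"question": "cause", "premise": "It rained.",
             "answer_choices": ["A", "B"], "label": 0}
effect_doc = {"question": "effect", "premise": "It rained.",
              "answer_choices": ["The ground got wet.", "The sun rose."],
              "label": 0}
print(safe_apply(cause_doc))      # None
print(safe_apply(effect_doc)[1])  # The ground got wet.
```

In the real harness the equivalent check would live in `doc_to_text` (or in the invalid-doc filtering pass shown in the log above), so mismatched docs are dropped before request construction.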
Sorry for the slow reply! This is something we're still working out the finer points of. It's related to bigscience-workshop/promptsource#749 and bigscience-workshop/promptsource#792. I don't really have a good answer right now, but in the T0 codebase I believe we had to manually check for this case.