This repository has been archived by the owner on Jul 23, 2024. It is now read-only.

[Task Submission] Quantifier Understanding (quantifier_understanding) #20

Open: lerow wants to merge 7 commits into base: main


@lerow lerow commented Aug 1, 2023

Quantifier Understanding

The task evaluates generalization in the understanding of quantifiers. It aims to measure
how well language models can capture the semantics of logical quantifiers in natural language.

Authors

Implementation

The task re-implements the evaluation function to compute accuracy scores.

Usage

Given predictions and gold labels, evaluate_predictions() outputs the accuracy score.
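
As a rough illustration of the accuracy computation described above, here is a minimal sketch. It is not the task's actual code: the function name accuracy_sketch, the "target" field name, and the list-of-dicts input shape are assumptions made for this example, and the real evaluate_predictions() in the task may differ.

```python
from typing import Any, Dict, Mapping, Sequence


def accuracy_sketch(
    predictions: Sequence[Mapping[str, Any]],
    gold: Sequence[Mapping[str, Any]],
    key: str = "target",  # assumed field name, not confirmed by the task
) -> Dict[str, float]:
    """Return the fraction of predictions that exactly match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        # No examples to score; report 0.0 rather than dividing by zero.
        return {"accuracy": 0.0}
    # Compare labels as whitespace-stripped strings, one pair at a time.
    correct = sum(
        str(p[key]).strip() == str(g[key]).strip()
        for p, g in zip(predictions, gold)
    )
    return {"accuracy": correct / len(gold)}


# Toy labels, made up for illustration only.
toy_predictions = [{"target": "yes"}, {"target": "no"}, {"target": "yes"}, {"target": "no"}]
toy_gold = [{"target": "yes"}, {"target": "no"}, {"target": "no"}, {"target": "no"}]
result = accuracy_sketch(toy_predictions, toy_gold)
print(result)  # {'accuracy': 0.75}
```

Exact string match is the simplest choice here; a task with free-form answers would likely need some normalization before comparison.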

Checklist

  • I and my co-authors agree that, if this PR is merged, the code will be available under the same license as the genbench_cbt repository.
  • Prior to submitting, I have run the GenBench CBT test suite using the genbench-cli test-task tool.
  • I have read the description of what should be in the doc.md of my task, and have added the required arguments.
  • I have submitted or will submit an accompanying paper to the GenBench workshop.

@kazemnejad
Contributor

Thanks for submitting a task to GenBench. Please be aware that your PR currently includes the task's data files. For the final submission, you will need to host the dataset files somewhere else (preferably as a Hugging Face dataset). Additionally, in your current submission, the data_source field in config.jsonnet does not contain the correct URL. It should probably be https://raw.githubusercontent.com/lerow/genbench_cbt/quantifier_understanding/src/genbench/tasks/quantifier_understanding/test_data.jsonl
This is why the tests are failing.

@vernadankers
Contributor

vernadankers commented Sep 1, 2023

Hello!

We are getting quite close to the deadline (September 1, 11:59PM anywhere on earth), which is why I wanted to remind you of the fact that your PR still needs some attention: please double-check the automated checks that failed, and ensure that the dataset files are hosted somewhere else; see Amir's message above.

Please don't forget to submit your accompanying paper to OpenReview via https://openreview.net/group?id=GenBench.org/2023/Workshop by September 1.

Good luck finalising your PR and paper, feel free to tag us if you have questions.
Cheers, Verna
On behalf of the GenBench team

@lerow
Author

lerow commented Sep 1, 2023

Hi,
Thanks for the reminder. I just fixed the URL - the tests should pass now.
Leroy
