A library of five metrics for evaluating large language models' pragmatic competence:
- Naturalness: the LLM generates a surprisal score for each sentence in a minimal pair as a proxy for text naturalness, reflecting how unexpected the sentence is given the preceding context. If an LLM is pragmatically sensitive, it should assign a lower surprisal score to the intended implied meaning in an appropriate context.
- Sensitivity to different Shades of Meaning (SSM)
- Pragmatic Reasoning Chains (PRC)
- Implicature Recovery Rate (IRR)
- Pragmatic Sensitivity Index (PSI)
Benchmark datasets (work in progress), examples, and documentation are also provided.
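The Naturalness metric above can be sketched as follows. This is a minimal illustration, not the library's implementation: it uses a hypothetical hand-built table of word log-probabilities in place of a real LLM, which would instead supply token log-probabilities conditioned on the preceding context. The example sentences and all probability values are invented for illustration.

```python
import math

# Hypothetical per-word log-probabilities, standing in for an LLM's
# context-conditioned token log-probabilities. Context (assumed):
# "Some of the students passed." (scalar implicature: not all passed)
TOY_LOGPROBS = {
    "not": math.log(0.20), "all": math.log(0.15),
    "passed": math.log(0.10), "every": math.log(0.01),
    "student": math.log(0.05),
}
OOV_LOGPROB = math.log(1e-4)  # fallback for words not in the toy table

def surprisal(sentence: str) -> float:
    """Total surprisal in bits: -sum over words of log2 P(word)."""
    words = sentence.lower().rstrip(".").split()
    return -sum(TOY_LOGPROBS.get(w, OOV_LOGPROB) for w in words) / math.log(2)

# Minimal pair: the pragmatically implied reading vs. the literal one.
implied, literal = "Not all passed", "Every student passed"
scores = {s: surprisal(s) for s in (implied, literal)}
# A pragmatically sensitive model should score the implied reading
# as less surprising (lower surprisal) in this context.
```

With a real model, the same comparison would be run over each minimal pair in the benchmark, conditioning on the pair's shared context.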