
Add seed parameter (optional) and custom evaluation metric for citations overlap #122

Merged (2 commits) on Oct 23, 2024

Conversation

@pamelafox (Contributor) commented on Oct 23, 2024

Purpose

This pull request adds a new metric for evaluating citation matches and incorporates a random seed for reproducibility in various functions. The most important changes are the new CitationsMatchedMetric class, configuration updates to include the new metric, and function signatures modified to accept and use a seed parameter.

New Metric Addition:

  • evals/eval_config.json: Updated the requested_metrics to include "citations_matched" and added a seed parameter in target_parameters.
  • evals/evaluate.py: Added the CitationsMatchedMetric class to compute the percentage of citations matched between the response and the ground truth, and registered the new metric in the evaluation pipeline.
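The actual implementation lives in evals/evaluate.py; as a rough illustration only, a citations-matched computation along these lines could be sketched as follows. The function name, the bracketed citation format (e.g. [info1.txt]), and the regex are all assumptions for this sketch, not details taken from the PR.

```python
import re

# Assumed citation format for this sketch: bracketed source names
# such as [info1.txt] or [report.pdf#page=2].
CITATION_PATTERN = re.compile(r"\[([^\]]+)\]")

def citations_matched(response: str, ground_truth: str) -> float:
    """Return the fraction of ground-truth citations that appear in the response."""
    truth_citations = set(CITATION_PATTERN.findall(ground_truth))
    if not truth_citations:
        # No citations expected; treat as zero matched rather than divide by zero.
        return 0.0
    response_citations = set(CITATION_PATTERN.findall(response))
    return len(truth_citations & response_citations) / len(truth_citations)
```

Computing the ratio over the ground-truth citations (rather than the response's) means extra, unexpected citations in the response do not inflate the score.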

Seed Parameter for Reproducibility:

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[ ] Yes
[X] No

Type of change

[ ] Bugfix
[X] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no API changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

  • The current tests all pass (python -m pytest).
  • I added tests that prove my fix is effective or that my feature works.
  • I ran python -m pytest --cov to verify 100% coverage of added lines.
  • I ran python -m mypy to check for type errors.
  • I either used the pre-commit hooks or ran ruff manually on my code.

@pamelafox merged commit 469474d into main on Oct 23, 2024
5 checks passed