Add seed parameter (optional) and custom evaluation metric for citations overlap #122
Purpose
This pull request adds a new metric for evaluating citation matches and an optional random seed for reproducibility. The most important changes are the new `CitationsMatchedMetric` class, configuration updates that enable the new metric, and functions modified to accept and use a `seed` parameter.
New Metric Addition:
- `evals/eval_config.json`: Updated the `requested_metrics` to include `"citations_matched"` and added a `seed` parameter in `target_parameters`.
- `evals/evaluate.py`: Added the `CitationsMatchedMetric` class to compute the percentage of citations matched between the response and the ground truth. Registered this new metric in the evaluation pipeline. [1] [2]
Seed Parameter for Reproducibility:
- `src/backend/fastapi_app/api_models.py`: Added a `seed` field to the `ChatRequestOverrides` model.
- `src/backend/fastapi_app/rag_advanced.py`: Modified multiple functions (`generate_search_query`, `prepare_context`, `answer`) to accept and use the `seed` parameter. [1] [2] [3] [4]
- `src/backend/fastapi_app/rag_base.py`: Updated the `get_params` method to include the `seed` parameter.
- `src/backend/fastapi_app/rag_simple.py`: Added the `seed` parameter to the `answer` and `answer_stream` methods. [1] [2]
Does this introduce a breaking change?
When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.
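On the breaking-change question, the PR title describes the seed as optional. Below is a minimal sketch of why an optional seed would leave existing requests unaffected, assuming the field defaults to `None` and is only forwarded when set. `Overrides` and `build_completion_kwargs` are hypothetical stand-ins for illustration, not the actual `ChatRequestOverrides` model or the code in `rag_simple.py`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Overrides:
    """Hypothetical stand-in for the request overrides model."""

    temperature: float = 0.3
    seed: Optional[int] = None  # assumption: optional, defaults to None


def build_completion_kwargs(overrides: Overrides) -> Dict[str, Any]:
    """Build keyword arguments for a chat-completion call (illustrative only)."""
    kwargs: Dict[str, Any] = {"temperature": overrides.temperature}
    # Forward the seed only when the caller supplied one, so requests
    # that never mention a seed produce the same arguments as before.
    if overrides.seed is not None:
        kwargs["seed"] = overrides.seed
    return kwargs


old_style = build_completion_kwargs(Overrides())
seeded = build_completion_kwargs(Overrides(seed=42))
```

Under these assumptions, a request without a seed yields the same arguments as before the change, and a seeded request adds only the `seed` key.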
Type of change
Code quality checklist
See CONTRIBUTING.md for more details.
- `python -m pytest`
- `python -m pytest --cov` to verify 100% coverage of added lines
- `python -m mypy` to check for type errors
- `ruff` manually on my code.
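To make the `citations_matched` metric from the Purpose section concrete, and to show the kind of assertions that could cover it under `python -m pytest`, here is a minimal sketch. It assumes citations appear in the text as bracketed numbers such as `[12]`. The function name, the regular expression, and the handling of the edge cases (a missing response, a ground truth with no citations) are illustrative assumptions and may differ from the `CitationsMatchedMetric` class added in `evals/evaluate.py`.

```python
import re
from typing import Optional

# Assumption: a citation is a bracketed integer, e.g. "[12]".
CITATION_PATTERN = re.compile(r"\[(\d+)\]")


def citations_matched(response: Optional[str], ground_truth: str) -> float:
    """Return the fraction of ground-truth citations that also appear in the response."""
    if response is None:
        # Assumed sentinel for "no response to evaluate".
        return -1.0
    truth_citations = set(CITATION_PATTERN.findall(ground_truth))
    if not truth_citations:
        # Assumed convention: nothing to match, so report 0 instead of dividing by zero.
        return 0.0
    response_citations = set(CITATION_PATTERN.findall(response))
    return len(truth_citations & response_citations) / len(truth_citations)


score = citations_matched(
    response="Try the trail shoes [1] or the boots [3].",
    ground_truth="The trail shoes [1] and the sandals [2] both fit.",
)
```

Here `score` is 0.5, because one of the two ground-truth citations (`[1]`) appears in the response. The sketch returns a fraction between 0 and 1; multiply by 100 if a percentage is wanted.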