test prompt change and temperature change #106

pamelafox · 2024-10-22T20:16:54Z

Purpose

Test evaluation

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[ ] Yes
[ ] No

Type of change

[ ] Bugfix
[ ] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

The current tests all pass (python -m pytest).
I added tests that prove my fix is effective or that my feature works
I ran python -m pytest --cov to verify 100% coverage of added lines
I ran python -m mypy to check for type errors
I either used the pre-commit hooks or ran ruff manually on my code.

pamelafox · 2024-10-22T20:17:20Z

/evaluate

github-actions · 2024-10-22T20:17:35Z

Starting evaluation! Check the Actions tab for progress, or wait for a comment with the results.

github-actions · 2024-10-22T20:20:53Z

metric	stat	baseline	pr106
gpt_groundedness	mean_rating	5.0	5.0
gpt_groundedness	pass_rate	1.0	1.0
gpt_relevance	mean_rating	5.0	5.0
gpt_relevance	pass_rate	1.0	1.0
answer_length	mean	978.9	991.5
latency	mean	2.51	2.05
citation_match	rate	1.0	1.0
num_questions	total	10	10

pamelafox · 2024-10-22T20:26:34Z

/evaluate

github-actions · 2024-10-22T20:26:47Z

Starting evaluation! Check the Actions tab for progress, or wait for a comment with the results.

github-actions · 2024-10-22T20:29:28Z

metric	stat	baseline	pr106
gpt_groundedness	mean_rating	5.0	5.0
gpt_groundedness	pass_rate	1.0	1.0
gpt_relevance	mean_rating	5.0	5.0
gpt_relevance	pass_rate	1.0	1.0
answer_length	mean	978.9	1051.1
latency	mean	2.51	2.18
citation_match	rate	1.0	1.0
num_questions	total	10	10

pamelafox · 2024-10-22T20:35:03Z

/evaluate

github-actions · 2024-10-22T20:35:24Z

Starting evaluation! Check the Actions tab for progress, or wait for a comment with the results.

github-actions · 2024-10-22T20:39:00Z

metric	stat	baseline	pr106
gpt_groundedness	mean_rating	5.0	5.0
gpt_groundedness	pass_rate	1.0	1.0
gpt_relevance	mean_rating	5.0	5.0
gpt_relevance	pass_rate	1.0	1.0
answer_length	mean	978.9	984.9
latency	mean	2.51	2.13
citation_match	rate	1.0	1.0
num_questions	total	10	10
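
Across the three runs, only answer_length and latency differ from the baseline; the groundedness, relevance, and citation metrics are identical. To make the size of those differences easier to read, here is a minimal sketch that computes the percent change per run. It is not part of the evaluation workflow: the names `BASELINE`, `PR_RUNS`, `percent_change`, and `summarize` are hypothetical, and the numbers are copied by hand from the three result comments above.

```python
# Hypothetical helper, not part of the repo's evaluation tooling.
# Values are copied from the three /evaluate result comments in this PR.
BASELINE = {"answer_length": 978.9, "latency": 2.51}
PR_RUNS = [
    {"answer_length": 991.5, "latency": 2.05},   # first /evaluate run
    {"answer_length": 1051.1, "latency": 2.18},  # second /evaluate run
    {"answer_length": 984.9, "latency": 2.13},   # third /evaluate run
]


def percent_change(baseline: float, value: float) -> float:
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def summarize(baseline: dict, runs: list) -> dict:
    """Map each metric to its percent change in every run, rounded to 0.1."""
    return {
        metric: [round(percent_change(base, run[metric]), 1) for run in runs]
        for metric, base in baseline.items()
    }


if __name__ == "__main__":
    for metric, changes in summarize(BASELINE, PR_RUNS).items():
        print(metric, changes)
```

With only 10 questions per run, differences of this size may be run-to-run noise rather than an effect of the prompt or temperature change, so treat the percentages as descriptive only.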

pamelafox added 2 commits October 22, 2024 20:15

Modify prompt · c9fd5a2
Modify prompt · f2c6df7
Change temp in eval config · be04034
Try to make it worse · 9ba9aea

pamelafox closed this Oct 22, 2024
