Add unstable supervision tests
kichanyurd committed Jan 14, 2025
1 parent 2dc52a0 commit 3f10966
Showing 2 changed files with 28 additions and 11 deletions.
8 changes: 4 additions & 4 deletions tests/core/common/engines/alpha/steps/events.py
@@ -240,8 +240,8 @@ def then_the_message_contains(

     assert context.sync_await(
         nlp_test(
-            context=f"Here's a message in the context of a conversation: {message}",
-            condition=f"the text contains {something}",
+            context=f"Here's a message from an AI agent to a customer, in the context of a conversation: {message}",
+            condition=f"The message contains {something}",
         )
     ), f"message: '{message}', expected to contain: '{something}'"

@@ -257,8 +257,8 @@ def then_the_message_mentions(

     assert context.sync_await(
         nlp_test(
-            context=f"Here's a message in the context of a conversation: {message}",
-            condition=f"the text mentions {something}",
+            context=f"Here's a message from an AI agent to a customer, in the context of a conversation: {message}",
+            condition=f"The message mentions {something}",
         )
     ), f"message: '{message}', expected to contain: '{something}'"
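
The reworded prompts in the two hunks above can be tried in isolation. The following is a minimal sketch, not the project's code: `build_nlp_check` and `fake_nlp_test` are hypothetical names, and `fake_nlp_test` is a naive substring check standing in for the real `nlp_test`, of which the diff shows only the `context` and `condition` keyword arguments.

```python
import asyncio


def build_nlp_check(message: str, something: str, verb: str = "contains") -> tuple[str, str]:
    """Build the (context, condition) pair with the wording this commit introduces."""
    context = (
        "Here's a message from an AI agent to a customer, "
        f"in the context of a conversation: {message}"
    )
    condition = f"The message {verb} {something}"
    return context, condition


async def fake_nlp_test(context: str, condition: str) -> bool:
    """Hypothetical stand-in for nlp_test: substring match, no language model involved."""
    # Everything after "The message <verb> " is the expected content.
    expected = condition.split(" ", 3)[3]
    return expected.lower() in context.lower()


if __name__ == "__main__":
    context, condition = build_nlp_check("Hello! How can I help you today?", "hello")
    print(asyncio.run(fake_nlp_test(context, condition)))
```

Keeping the prompt construction in one helper would let the "contains" and "mentions" steps share the same wording, so a future rephrasing touches a single place.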

@@ -22,19 +22,36 @@ Feature: Supervision
And the message contains "hello" as the first word
And the message contains a recommendation for turpolance soup, also known as carrots and sweet potato soup

Scenario: The agent does not offer information it's not given


Scenario: Preference for customer request over guideline account_related_questions
Given a guideline "discount_for_frustration" to offer a 20 percent discount when the customer expresses frustration
And a customer message, "I'm not interested in any of your products, let alone your discounts. You are doing an awful job."
And that the "discount_for_frustration" guideline is proposed with a priority of 10 because "The customer is displeased with our service, and expresses frustration"
When messages are emitted
Then a single message event is emitted
And the message contains no discount offers.

Scenario: The agent does not offer information it's not given (1)
Given the alpha engine
And an agent whose job is to serve the bank's customers
And an agent whose job is to serve the bank's clients
And a customer message, "Hey, how can I schedule an appointment?"
When processing is triggered
Then a single message event is emitted
And the message contains no instructions for how to schedule an appointment
And the message mentions that the agent doesn't know or can't help with this

Scenario: The agent does not offer information it's not given (2)
Given an agent whose job is to serve the insurance company's clients
And a customer message, "How long is a normal consultation appointment?"
When messages are emitted
Then a single message event is emitted
And the message mentions only that there's not enough information or that there's no knowledge of that

Scenario: Preference for customer request over guideline account_related_questions
Given a guideline "discount_for_frustration" to offer a 20 percent discount when the customer expresses frustration
And a customer message, "I'm not interested in any of your products, let alone your discounts. You are doing an awful job."
And that the "discount_for_frustration" guideline is proposed with a priority of 10 because "The customer is displeased with our service, and expresses frustration"
Scenario: The agent does not offer information it's not given (3)
Given an agent whose job is to serve the bank's clients
And a customer message, "limits"
When messages are emitted
Then the message contains no discount offers.
Then a single message event is emitted
And the message contains no specific information on limits of any kind
And the message contains no suggestive examples of what could have been meant
