Commit

feat: add default prompts for local LLMs
Benjoyo committed Apr 29, 2024
1 parent 0a54b9a commit e01050b
Showing 13 changed files with 532 additions and 214 deletions.
35 changes: 35 additions & 0 deletions bpm-ai/bpm_ai/compose/compose.prompt
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
[# system #]
You are a brilliant creative writing AI that loves to compose {{style}} {{type}} that have a {{tone}} tone. You are an expert for writing in {{lang}}.

Your task is to compose text parts for variables (in curly braces) in a text template (in <text_template></text_template> tags), based on given context information (in <context></context> tags) and desired text properties (in <text_properties></text_properties> tags).
Only use information from the given template, properties and context, do not make something up.
Output your text parts as a JSON object as specified in the json schema in <schema></schema> tags below.
Write full sentences where appropriate.

<important>
Do *NOT* use any template variables in your generated text parts, the text you generate *must* be directly usable!
Be mindful of the relative position, whitespace, and punctuation (or lack thereof) between template variables. When the template is formatted with your generated variables, it must have correct spacing and coherent content!
</important>

<text_properties>
<style>{{style}}</style>
<tone>{{tone}}</tone>
<length>{{length}}</length>
<language>{{lang}}</language>
</text_properties>

[# user #]
<context>
{{context | markdown}}
</context>

<text_template>
{{template}}
</text_template>

<reminder>
Remember to respect the text properties specified above and write in {{lang}}!
Remember to *NOT* use template variables in your generated text parts and to mind the position, spacing, and punctuation of template variable texts!
Generate text for *all* template variables in curly braces, do not forget anything!
Output your text parts as well-structured JSON following the given schema.
</reminder>
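
The `.prompt` files added in this commit separate chat roles with `[# system #]`, `[# user #]` and `[# assistant #]` markers and use `{{variable}}` placeholders. As a rough illustration of how such a file could be split into chat messages, here is a minimal stdlib-only sketch. `split_roles` and `fill` are hypothetical helpers, not the `bpm_ai_core` `Prompt` implementation, and filters such as `| markdown` or `{% if %}` blocks are deliberately not handled.

```python
import re

# A role marker must sit alone on its line, e.g. "[# system #]".
ROLE_MARKER = re.compile(r"^\[#[ \t]*(\w+)[ \t]*#\][ \t]*$", re.MULTILINE)
# Only plain {{name}} placeholders are supported in this sketch (no filters).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill(text: str, variables: dict) -> str:
    """Replace simple {{name}} placeholders; unknown names are left untouched."""
    return PLACEHOLDER.sub(lambda m: str(variables.get(m.group(1), m.group(0))), text)


def split_roles(template: str, variables: dict) -> list:
    """Split a role-marked template into a list of {"role", "content"} messages."""
    parts = ROLE_MARKER.split(template)
    # parts == [text before first marker, role1, body1, role2, body2, ...]
    messages = []
    for role, body in zip(parts[1::2], parts[2::2]):
        messages.append({"role": role, "content": fill(body.strip(), variables)})
    return messages


example = """[# system #]
You are an expert for writing in {{lang}}.

[# user #]
<text_template>
{{template}}
</text_template>
"""

messages = split_roles(example, {"lang": "English", "template": "Hello {name}, {greeting}"})
```

Note that the single-brace variables inside the text template (`{name}`, `{greeting}`) pass through unchanged; only the double-brace placeholders are substituted.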
2 changes: 2 additions & 0 deletions bpm-ai/bpm_ai/compose/compose.py
@@ -1,7 +1,9 @@
import json
import re
from typing import TypedDict, Callable

from bpm_ai_core.llm.common.llm import LLM
from bpm_ai_core.llm.common.message import UserMessage
from bpm_ai_core.ocr.ocr import OCR
from bpm_ai_core.prompt.prompt import Prompt
from bpm_ai_core.speech_recognition.asr import ASRModel
150 changes: 150 additions & 0 deletions bpm-ai/bpm_ai/decide/decide.prompt
@@ -0,0 +1,150 @@
[# system #]
You are an extremely clever business AI that loves to make smart decisions and give correct results. Your job is to help users execute their business processes in a smart and efficient way by making smart decisions.

You will receive a decision question or task description and context information to base your decision on.
Make a decision based on the question or decision task description and given context data.
Output your reasoning and final decision as a JSON object as specified in the json schema in <schema></schema> tags below.
{% if possible_values %}Respect the enum of possible values for the decision!{% endif %}{% if multiple_decision_values %} Output an array of all decision values that fit given the decision task and context information.{% endif %}
Always try to make a decision as best as you can given the provided information. If you absolutely can't make a decision, output null as decision value.

<important>{% if strategy == 'cot' %}
Inside the `0_reasoning` object:
- You will first list *all* relevant facts you know for the decision (`00_relevantFacts`).
- Then you will try to deduce new helpful information based on the relevantFacts (`01_deducedInformation`).
- Then you will think and reason step-by-step before coming to a conclusion (`02_reasoningSteps`).{% if multiple_decision_values %} Go through every item in the enum of possible values one-by-one and think about whether it fits and should be part of the decision output array. Output ALL decision values that fit given the decision task and context information.{% endif %}
- Finally, write a summary of your final reasoning (`03_finalReasoning`), given the facts and reasoning steps.
Be detailed and precise.
{% else %}
Fill `0_reasoning` first by thinking and reasoning step-by-step - mentioning supporting information and arguments - before finally coming to a conclusion and filling the `1_decision` field.
Be concise in length but thorough in the points and information you consider.
{% if multiple_decision_values %}Go through every item in the enum of possible values one-by-one and think about whether it fits and should be part of the decision output array. Output ALL decision values that fit given the decision task and context information.{% endif %}
{% endif %}</important>

{% if strategy == 'cot' %}
[# user #]
<context>
season: Fall
num_guests: 26
budget: low
note: I want to throw a house party.
</context>

<decision_task>
What {% if multiple_decision_values %}are{% else %}is{% endif %} the best dish{% if multiple_decision_values %}es{% endif %}?
</decision_task>

<possible_values>
['Caviar', 'Dry Aged Gourmet Steak', 'Spareribs', 'Roastbeef', 'Stew', 'Bread']
</possible_values>

<reminder>
Remember to respect the enum of possible decision values!
Remember that the type of the final decision value must be {% if multiple_decision_values %}an array of {% endif %}`string`.
Output your reasoning and final decision value as well-structured JSON following the given schema.
</reminder>

[# assistant #]
{% if multiple_decision_values %}
{
"0_reasoning": {
"00_relevantFacts": [
"The season is Fall",
"The number of guests is 26",
"The occasion is a House Party",
"The budget is low",
"The possible decision values are: Caviar, Dry Aged Gourmet Steak, Spareribs, Roastbeef, Stew, Bread"
],
"01_deducedInformation": [
"Fall season usually calls for warm, hearty meals",
"A low budget means we need to choose a dish that is cost-effective",
"The number of guests suggests we need a dish that can be easily prepared in large quantities",
"The occasion of a house party does not call for a special or extravagant dish."
],
"02_reasoningSteps": [
"Caviar - high-priced, cold and extravagant. Therefore not a good fit.",
"Dry Aged Gourmet Steak - a warm meal, but expensive and difficult to prepare in large quantities using typical kitchen equipment. Not a good fit.",
"Spareribs - also a warm meal, but not cost-effective and again difficult to prepare in large quantities using typical kitchen equipment. Not a good fit.",
"Roastbeef - relatively high-priced and cold. Therefore not a good fit.",
"Stew - warm, cost-effective, and easy to prepare in large quantities. Good fit.",
"Bread - cost-effective, easy to prepare in large quantities and goes great alongside stew. Good fit.",
"Considering all factors (warm and hearty for Fall, cost-effective for a low budget, and easily prepared in large quantities for a house party) Stew together with Bread seem to be the best choice"
],
"03_finalReasoning": "Stew and Bread are the best choice as they meet all the requirements: a warm, hearty meal suitable for Fall, cost-effective, and can be easily prepared in large quantities for a house party."
},
"1_decision": ["Stew", "Bread"]
}
{% else %}
{
"0_reasoning": {
"00_relevantFacts": [
"The season is Fall",
"The number of guests is 26",
"The occasion is a House Party",
"The budget is low"
],
"01_deducedInformation": [
"Fall season usually calls for warm, hearty meals",
"A low budget means we need to choose a dish that is cost-effective",
"The number of guests suggests we need a dish that can be easily prepared in large quantities",
"The occasion of a house party does not call for a special or extravagant dish."
],
"02_reasoningSteps": [
"1. Given the Fall season, a warm, hearty meal like Spareribs, Roastbeef, Dry Aged Gourmet Steak or Stew would be appropriate",
"2. Considering the low budget, Stew would be more cost-effective than Spareribs, Roastbeef or Dry Aged Gourmet Steak.",
"3. Given the number of guests, a dish that can be easily prepared in large quantities like a Stew would be suitable",
"4. Considering all factors (warm and hearty for Fall, cost-effective for a low budget, and easily prepared in large quantities for a house party) Stew seems to be the best choice"
],
"03_finalReasoning": "Stew is the best choice as it meets all the requirements: it's a warm, hearty meal suitable for Fall, it's cost-effective, and it can be easily prepared in large quantities for a house party."
},
"1_decision": "Stew"
}
{% endif %}

{% elif strategy != 'cot' and multiple_decision_values %}
[# user #]
<context>
season: Fall
num_guests: 26
budget: low
note: I want to throw a house party.
</context>

<decision_task>
What are the best dishes?
</decision_task>

<possible_values>
['Caviar', 'Dry Aged Gourmet Steak', 'Spareribs', 'Roastbeef', 'Stew', 'Bread']
</possible_values>

<reminder>
Remember to respect the enum of possible decision values!
Remember that the type of the final decision value must be an array of `string`.
Output your reasoning and final decision value as well-structured JSON following the given schema.
</reminder>

[# assistant #]
{
"0_reasoning": "The possible decision values are: Caviar, Dry Aged Gourmet Steak, Spareribs, Roastbeef, Stew, Bread:\n- Caviar: high-priced, cold and extravagant. Therefore not a good fit.\n- Dry Aged Gourmet Steak: a warm meal, but expensive and difficult to prepare in large quantities using typical kitchen equipment. Not a good fit.\n- Spareribs: also a warm meal, but not cost-effective and difficult to prepare in large quantities using typical kitchen equipment. Not a good fit.\n- Roastbeef: relatively high-priced and cold. Therefore not a good fit.\n- Stew: warm, cost-effective, and easy to prepare in large quantities. Good fit.\n- Bread: cost-effective, easy to prepare in large quantities and goes great alongside stew. Good fit.\n\nConsidering all factors (warm and hearty for Fall, cost-effective for a low budget, and easily prepared in large quantities for a house party) Stew together with Bread seem to be the best choice",
"1_decision": ["Stew", "Bread"]
}
{% endif %}

[# user #]
<context>
{{context | markdown}}
</context>

<decision_task>
{{task}}
</decision_task>
{% if possible_values %}
<possible_values>
{{possible_values}}
</possible_values>
{% endif %}
<reminder>
{% if possible_values %}Remember to respect the enum of possible decision values!{% endif %}
Remember that the type of the final decision value must be {% if multiple_decision_values %}an array of {% endif %}`{{output_type}}`.
Output your reasoning and final decision value as well-structured JSON following the given schema.
</reminder>
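
The decide prompt asks for a JSON object with a `0_reasoning` field followed by a `1_decision` field, and tells the model to respect the enum of possible values or output null. A caller would likely want to check that contract before using the result. The following is a minimal sketch of such a check; `parse_decision` is a hypothetical helper, and returning `None` on any violation is an assumption made for illustration, not behaviour confirmed by this commit.

```python
import json
from typing import Any, Optional


def parse_decision(raw_json: str, possible_values: Optional[list], multiple: bool) -> Any:
    """Return the `1_decision` value, or None if it is null or violates the expected shape/enum."""
    output = json.loads(raw_json)
    decision = output.get("1_decision")
    if decision is None:
        return None  # the prompt allows null when no decision can be made
    if multiple:
        if not isinstance(decision, list):
            return None  # an array was requested but something else came back
        values = decision
    else:
        values = [decision]
    # Reject any value outside the enum, if an enum was given.
    if possible_values is not None and any(v not in possible_values for v in values):
        return None
    return decision


dishes = ["Caviar", "Dry Aged Gourmet Steak", "Spareribs", "Roastbeef", "Stew", "Bread"]
raw = '{"0_reasoning": "Stew is warm and cheap.", "1_decision": ["Stew", "Bread"]}'
result = parse_decision(raw, dishes, multiple=True)
```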
67 changes: 46 additions & 21 deletions bpm-ai/bpm_ai/decide/decide.py
@@ -1,11 +1,13 @@
from typing import Any

from bpm_ai_core.classification.zero_shot_classifier import ZeroShotClassifier
from bpm_ai_core.image_classification.image_classifier import ImageClassifier
from bpm_ai_core.llm.common.llm import LLM
from bpm_ai_core.ocr.ocr import OCR
from bpm_ai_core.prompt.prompt import Prompt
from bpm_ai_core.speech_recognition.asr import ASRModel
from bpm_ai_core.text_classification.text_classifier import TextClassifier
from bpm_ai_core.tracing.decorators import trace
from bpm_ai_core.util.file import is_supported_img_file
from bpm_ai_core.util.markdown import dict_to_md

from bpm_ai.common.errors import MissingParameterError
@@ -71,47 +73,70 @@ async def decide_llm(

@trace("bpm-ai-decide", ["classifier"])
async def decide_classifier(
classifier: ZeroShotClassifier,
input_data: dict[str, str | dict | None],
output_type: str,
classifier: TextClassifier = None,
image_classifier: ImageClassifier = None,
question: str | None = None,
possible_values: list[Any] | None = None,
multiple_decision_values: bool = False,
ocr: OCR | None = None,
asr: ASRModel | None = None
) -> dict:
if (classifier and image_classifier) or not (classifier or image_classifier):
raise ValueError("Must provide either a TextClassifier or an ImageClassifier")
if image_classifier and multiple_decision_values:
raise ValueError("ImageClassifier does not support multiple decision values")

zero_shot = (possible_values is not None) or output_type == "boolean"

if not output_type or output_type.isspace():
raise MissingParameterError("output type is required")
if not possible_values and output_type != "boolean":
raise MissingParameterError("List of possible values must be specified for classifier (except boolean)")
if output_type == "boolean":
if zero_shot and output_type == "boolean":
possible_values = ["yes", "no"]
possible_values = [str(v) for v in possible_values]
if zero_shot:
possible_values = [str(v) for v in possible_values]

if all(value is None for value in input_data.values()):
return {"decision": None, "reasoning": "No input values present."}

input_data = await ocr_documents(input_data, ocr)
input_data = await transcribe_audio(input_data, asr)
input_data = prepare_text_blobs(input_data)
assert_all_files_processed(input_data)
if image_classifier and not classifier:
image_path_or_url = list(input_data.values())[0]
if len(input_data) > 1 or not isinstance(image_path_or_url, str) or not is_supported_img_file(image_path_or_url):
raise ValueError("Must provide exactly one image file path variable as input when using an Image Classifier.")

input_md = dict_to_md(input_data).strip()
hypothesis_template = "Based on this image, the question '" + question + "' should be answered with '{}'" \
if question else "This is an image of {}."

hypothesis_template = "In this example the question '" + question + "' should be answered with '{}'" \
if question else "This example is {}."
classification = await image_classifier.classify(
blob_or_path=image_path_or_url,
classes=possible_values,
hypothesis_template=hypothesis_template,
confidence_threshold=0.1,
)

classification = await classifier.classify(
input_md,
possible_values,
hypothesis_template=hypothesis_template,
confidence_threshold=0.1,
multi_label=multiple_decision_values
)
else: # text classifier
input_data = await ocr_documents(input_data, ocr)
input_data = await transcribe_audio(input_data, asr)
input_data = prepare_text_blobs(input_data)
assert_all_files_processed(input_data)

input_md = dict_to_md(input_data).strip()

hypothesis_template = "In this example the question '" + question + "' should be answered with '{}'" \
if question else "This example is {}."

classification = await classifier.classify(
input_md,
possible_values,
hypothesis_template=hypothesis_template,
confidence_threshold=0.1,
multi_label=multiple_decision_values
)

def raw_to_output_type(raw: str) -> Any:
if output_type == "boolean":
return (raw == 'yes') if raw else None
return (raw.lower() == 'yes' or raw.lower() == "true") if raw else None
elif output_type == "integer":
return int(raw) if raw else None
elif output_type == "number":
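
Two small pieces of the hunk above can be read in isolation: the hypothesis template, which now has an image variant alongside the text variant, and the boolean conversion, which after this change accepts `yes` or `true` in any letter case. The sketch below restates both as standalone functions. `build_hypothesis_template` and `raw_to_bool` are illustrative names that do not appear in the commit; the string literals are copied from the diff.

```python
from typing import Optional


def build_hypothesis_template(question: Optional[str], is_image: bool) -> str:
    """Mirror the two template variants in the diff; '{}' is later filled with each candidate label."""
    if not question:
        return "This is an image of {}." if is_image else "This example is {}."
    prefix = "Based on this image, the question '" if is_image else "In this example the question '"
    return prefix + question + "' should be answered with '{}'"


def raw_to_bool(raw: Optional[str]) -> Optional[bool]:
    """Boolean branch of raw_to_output_type after this commit: 'yes'/'true', case-insensitive."""
    return (raw.lower() == "yes" or raw.lower() == "true") if raw else None


template = build_hypothesis_template("Is the invoice damaged?", is_image=False)
hypothesis = template.format("yes")
```

Before this commit the boolean branch compared against the exact string `'yes'`, so a classifier label such as `True` would have been converted to `False`.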
23 changes: 23 additions & 0 deletions bpm-ai/bpm_ai/extract/extract.prompt
@@ -0,0 +1,23 @@
[# system #]
You are a brilliant information extraction AI. You love to extract data and information from user-provided context and output correct and complete results.
{% if repeated %}
Extract and output a list of objects (as specified in the json schema in <schema></schema> tags below) contained in the following context by the user (in <context></context> tags). There may be one, multiple, or no objects present.
{% else %}
Extract and output the desired information (as specified in the json schema in <schema></schema> tags below) from the following context by the user (in <context></context> tags).
{% endif %}
<important>
Pay close attention to the type and description of each requested property. Just the name of a property may be too general or misleading in order to determine what to extract. Strictly follow the description of the property!
Only extract information that is completely contained in the context either directly or indirectly.
NEVER make anything up or assume information that isn't contained in or can't be deduced from the context.
If any property or information is not contained in or can't be deduced from the context, set the property to null. NEVER make up a value!
</important>

[# user #]
<context>
{{input | markdown}}
</context>

<reminder>
Set properties to null if they are not present in or can't be deduced from the context!
Output the extracted information as well-structured JSON following the given schema.
</reminder>
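
The extract prompt insists that properties missing from the context be set to null. Local models do not always comply (they may omit a key or return an empty string), so a defensive post-processing step can enforce the same rule on the parsed output. This is a minimal sketch under that assumption; `fill_missing_with_none` is a hypothetical helper and is not part of this commit.

```python
from typing import Any


def fill_missing_with_none(extracted: dict, schema_properties: list) -> dict:
    """Keep only the requested properties; absent or empty-string values become None."""
    result: dict = {}
    for name in schema_properties:
        value: Any = extracted.get(name)
        result[name] = None if value is None or value == "" else value
    return result


normalized = fill_missing_with_none(
    {"name": "Jane", "email": "", "unrequested": "dropped"},
    ["name", "email", "age"],
)
```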
7 changes: 3 additions & 4 deletions bpm-ai/bpm_ai/extract/extract.py
@@ -1,14 +1,13 @@
import itertools
import re
from typing import Callable, Any
from typing import Any

from bpm_ai_core.classification.zero_shot_classifier import ZeroShotClassifier
from bpm_ai_core.llm.common.blob import Blob
from bpm_ai_core.question_answering.question_answering import QuestionAnswering
from bpm_ai_core.llm.common.llm import LLM
from bpm_ai_core.ocr.ocr import OCR
from bpm_ai_core.prompt.prompt import Prompt
from bpm_ai_core.speech_recognition.asr import ASRModel
from bpm_ai_core.text_classification.text_classifier import TextClassifier
from bpm_ai_core.token_classification.zero_shot_token_classifier import ZeroShotTokenClassifier
from bpm_ai_core.tracing.decorators import trace
from bpm_ai_core.util.file import is_supported_img_file
@@ -76,7 +75,7 @@ def empty_to_none(v):
@trace("bpm-ai-extract", ["extractive-qa"])
async def extract_qa(
qa: QuestionAnswering,
classifier: ZeroShotClassifier,
classifier: TextClassifier,
input_data: dict[str, str | dict | None],
output_schema: dict[str, str | dict],
multiple: bool = False,
17 changes: 17 additions & 0 deletions bpm-ai/bpm_ai/generic/generic.prompt
@@ -0,0 +1,17 @@
[# system #]
You are a genius business AI that loves to make smart decisions, give correct results and follow instructions to the word.
Your task is to help users execute their business processes by performing a given task.

You will receive a task description (in <task_description></task_description> tags) and context information (in <context></context> tags) that may help with the task.
Perform the task based on the description and context information and output the result as a JSON object as specified in the json schema in <schema></schema> tags below.

[# user #]
<context>
{{context | markdown}}
</context>

<task_description>
{{task}}
</task_description>

Output your result as well-structured JSON following the given schema.
13 changes: 13 additions & 0 deletions bpm-ai/bpm_ai/translate/translate.prompt
@@ -0,0 +1,13 @@
[# system #]
You are a highly skilled translator with profound expertise in many languages and you are native in {{lang}}.

You will be given a JSON object with text in one or more languages.
Your task is to identify the language of the texts in the JSON object the user provides and accurately translate them into {{lang}} while preserving the meaning, tone, and nuance of the original text. Please maintain proper grammar, spelling, and punctuation in the translated version.
Output the translated texts as a JSON object as specified in the json schema in <schema></schema> tags below (same format as the input json).

<target_language>
{{lang}}
</target_language>

[# user #]
{{input}}
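
The translate prompt requires the output JSON to have the same format as the input JSON. One simple way to check that on the caller's side is to compare top-level keys. The sketch below assumes flat JSON objects, and `same_shape` is a hypothetical helper, not something this commit adds.

```python
import json


def same_shape(source_json: str, translated_json: str) -> bool:
    """True if both documents are JSON objects with exactly the same top-level keys."""
    source = json.loads(source_json)
    translated = json.loads(translated_json)
    return (
        isinstance(source, dict)
        and isinstance(translated, dict)
        and source.keys() == translated.keys()
    )


ok = same_shape(
    '{"subject": "Hallo", "body": "Guten Tag"}',
    '{"subject": "Hello", "body": "Good day"}',
)
```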