diff --git a/_freeze/reference/MallFrame/execute-results/html.json b/_freeze/reference/MallFrame/execute-results/html.json index 7d6f02a..cdf6513 100644 --- a/_freeze/reference/MallFrame/execute-results/html.json +++ b/_freeze/reference/MallFrame/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "b719238e79aa68d0ccd5c863f83a82ef", + "hash": "53465116dcf582f40271e524ed3e926d", "result": { "engine": "jupyter", - "markdown": "---\ntitle: MallFrame\n---\n\n\n\n`MallFrame(self, df)`\n\nExtension to Polars that add ability to use\nan LLM to run batch predictions over a data frame\n\nWe will start by loading the needed libraries, and\nset up the data frame that will be used in the\nexamples:\n\n\n::: {#4c168564 .cell execution_count=1}\n``` {.python .cell-code}\nimport mall\nimport polars as pl\npl.Config(fmt_str_lengths=100)\npl.Config.set_tbl_hide_dataframe_shape(True)\npl.Config.set_tbl_hide_column_data_types(True)\ndata = mall.MallData\nreviews = data.reviews\nreviews.llm.use(options = dict(seed = 100))\n```\n:::\n\n\n## Methods\n\n| Name | Description |\n| --- | --- |\n| [classify](#mall.MallFrame.classify) | Classify text into specific categories. |\n| [custom](#mall.MallFrame.custom) | Provide the full prompt that the LLM will process. |\n| [extract](#mall.MallFrame.extract) | Pull a specific label from the text. |\n| [sentiment](#mall.MallFrame.sentiment) | Use an LLM to run a sentiment analysis |\n| [summarize](#mall.MallFrame.summarize) | Summarize the text down to a specific number of words. |\n| [translate](#mall.MallFrame.translate) | Translate text into another language. |\n| [use](#mall.MallFrame.use) | Define the model, backend, and other options to use to |\n| [verify](#mall.MallFrame.verify) | Check to see if something is true about the text. 
|\n\n### classify { #mall.MallFrame.classify }\n\n`MallFrame.classify(col, labels='', additional='', pred_name='classify')`\n\nClassify text into specific categories.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------|-------------------------------------------------------------------------------------------------------------------------|--------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `labels` | list | A list or a DICT object that defines the categories to classify the text as. It will return one of the provided labels. | `''` |\n| `pred_name` | str | A character vector with the name of the new column where the prediction will be placed | `'classify'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#814ab89b .cell execution_count=2}\n``` {.python .cell-code}\nreviews.llm.classify(\"review\", [\"appliance\", \"computer\"])\n```\n\n::: {.cell-output .cell-output-display execution_count=2}\n```{=html}\n
\n
reviewclassify
"This has been the best TV I've ever used. Great screen, and sound.""computer"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""computer"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""appliance"
\n```\n:::\n:::\n\n\n::: {#31c287e4 .cell execution_count=3}\n``` {.python .cell-code}\n# Use 'pred_name' to customize the new column's name\nreviews.llm.classify(\"review\", [\"appliance\", \"computer\"], pred_name=\"prod_type\")\n```\n\n::: {.cell-output .cell-output-display execution_count=3}\n```{=html}\n
\n
reviewprod_type
"This has been the best TV I've ever used. Great screen, and sound.""computer"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""computer"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""appliance"
\n```\n:::\n:::\n\n\n::: {#e9ba7273 .cell execution_count=4}\n``` {.python .cell-code}\n#Pass a DICT to set custom values for each classification\nreviews.llm.classify(\"review\", {\"appliance\" : \"1\", \"computer\" : \"2\"})\n```\n\n::: {.cell-output .cell-output-display execution_count=4}\n```{=html}\n
\n
reviewclassify
"This has been the best TV I've ever used. Great screen, and sound.""1"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""2"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""1"
\n```\n:::\n:::\n\n\n### custom { #mall.MallFrame.custom }\n\n`MallFrame.custom(col, prompt='', valid_resps='', pred_name='custom')`\n\nProvide the full prompt that the LLM will process.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|-------------|--------|----------------------------------------------------------------------------------------|------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `prompt` | str | The prompt to send to the LLM along with the `col` | `''` |\n| `pred_name` | str | A character vector with the name of the new column where the prediction will be placed | `'custom'` |\n\n#### Examples\n\n::: {#d421a385 .cell execution_count=5}\n``` {.python .cell-code}\nmy_prompt = (\n \"Answer a question.\"\n \"Return only the answer, no explanation\"\n \"Acceptable answers are 'yes', 'no'\"\n \"Answer this about the following text, is this a happy customer?:\"\n)\n\nreviews.llm.custom(\"review\", prompt = my_prompt)\n```\n\n::: {.cell-output .cell-output-display execution_count=5}\n```{=html}\n
\n
reviewcustom
"This has been the best TV I've ever used. Great screen, and sound.""Yes"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""No"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""No"
\n```\n:::\n:::\n\n\n### extract { #mall.MallFrame.extract }\n\n`MallFrame.extract(col, labels='', expand_cols=False, additional='', pred_name='extract')`\n\nPull a specific label from the text.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------|----------------------------------------------------------------------------------------|-------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `labels` | list | A list or a DICT object that defines tells the LLM what to look for and return | `''` |\n| `pred_name` | str | A character vector with the name of the new column where the prediction will be placed | `'extract'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#6a162f51 .cell execution_count=6}\n``` {.python .cell-code}\n# Use 'labels' to let the function know what to extract\nreviews.llm.extract(\"review\", labels = \"product\")\n```\n\n::: {.cell-output .cell-output-display execution_count=6}\n```{=html}\n
\n
reviewextract
"This has been the best TV I've ever used. Great screen, and sound.""tv"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""washing machine"
\n```\n:::\n:::\n\n\n::: {#3bfdee78 .cell execution_count=7}\n``` {.python .cell-code}\n# Use 'pred_name' to customize the new column's name\nreviews.llm.extract(\"review\", \"product\", pred_name = \"prod\")\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n```{=html}\n
\n
reviewprod
"This has been the best TV I've ever used. Great screen, and sound.""tv"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""washing machine"
\n```\n:::\n:::\n\n\n::: {#13591417 .cell execution_count=8}\n``` {.python .cell-code}\n# Pass a vector to request multiple things, the results will be pipe delimeted\n# in a single column\nreviews.llm.extract(\"review\", [\"product\", \"feelings\"])\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n```{=html}\n
\n
reviewextract
"This has been the best TV I've ever used. Great screen, and sound.""tv | great"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop|frustration"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""washing machine | confusion"
\n```\n:::\n:::\n\n\n::: {#4666e1b7 .cell execution_count=9}\n``` {.python .cell-code}\n# Set 'expand_cols' to True to split multiple lables\n# into individual columns\nreviews.llm.extract(\n col=\"review\",\n labels=[\"product\", \"feelings\"],\n expand_cols=True\n )\n```\n\n::: {.cell-output .cell-output-display execution_count=9}\n```{=html}\n
\n
reviewproductfeelings
"This has been the best TV I've ever used. Great screen, and sound.""tv "" great"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop""frustration"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""washing machine "" confusion"
\n```\n:::\n:::\n\n\n::: {#1e53a6c5 .cell execution_count=10}\n``` {.python .cell-code}\n# Set custom names to the resulting columns\nreviews.llm.extract(\n col=\"review\",\n labels={\"prod\": \"product\", \"feels\": \"feelings\"},\n expand_cols=True\n )\n```\n\n::: {.cell-output .cell-output-display execution_count=10}\n```{=html}\n
\n
reviewprodfeels
"This has been the best TV I've ever used. Great screen, and sound.""tv "" great"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop""frustration"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""washing machine "" confusion"
\n```\n:::\n:::\n\n\n### sentiment { #mall.MallFrame.sentiment }\n\n`MallFrame.sentiment(col, options=['positive', 'negative', 'neutral'], additional='', pred_name='sentiment')`\n\nUse an LLM to run a sentiment analysis\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------------|----------------------------------------------------------------------------------------|---------------------------------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `options` | list or dict | A list of the sentiment options to use, or a named DICT object | `['positive', 'negative', 'neutral']` |\n| `pred_name` | str | A character vector with the name of the new column where the prediction will be placed | `'sentiment'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#2c18bb5d .cell execution_count=11}\n``` {.python .cell-code}\nreviews.llm.sentiment(\"review\")\n```\n\n::: {.cell-output .cell-output-display execution_count=11}\n```{=html}\n
\n
reviewsentiment
"This has been the best TV I've ever used. Great screen, and sound.""positive"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""negative"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""neutral"
\n```\n:::\n:::\n\n\n::: {#8b7ccef7 .cell execution_count=12}\n``` {.python .cell-code}\n# Use 'pred_name' to customize the new column's name\nreviews.llm.sentiment(\"review\", pred_name=\"review_sentiment\")\n```\n\n::: {.cell-output .cell-output-display execution_count=12}\n```{=html}\n
\n
reviewreview_sentiment
"This has been the best TV I've ever used. Great screen, and sound.""positive"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""negative"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""neutral"
\n```\n:::\n:::\n\n\n::: {#adf9e06d .cell execution_count=13}\n``` {.python .cell-code}\n# Pass custom sentiment options\nreviews.llm.sentiment(\"review\", [\"positive\", \"negative\"])\n```\n\n::: {.cell-output .cell-output-display execution_count=13}\n```{=html}\n
\n
reviewsentiment
"This has been the best TV I've ever used. Great screen, and sound.""positive"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""negative"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""negative"
\n```\n:::\n:::\n\n\n::: {#2f6ce57c .cell execution_count=14}\n``` {.python .cell-code}\n# Use a DICT object to specify values to return per sentiment\nreviews.llm.sentiment(\"review\", {\"positive\" : 1, \"negative\" : 0})\n```\n\n::: {.cell-output .cell-output-display execution_count=14}\n```{=html}\n
\n
reviewsentiment
"This has been the best TV I've ever used. Great screen, and sound."1
"I regret buying this laptop. It is too slow and the keyboard is too noisy"0
"Not sure how to feel about my new washing machine. Great color, but hard to figure"0
\n```\n:::\n:::\n\n\n### summarize { #mall.MallFrame.summarize }\n\n`MallFrame.summarize(col, max_words=10, additional='', pred_name='summary')`\n\nSummarize the text down to a specific number of words.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------|----------------------------------------------------------------------------------------|-------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `max_words` | int | Maximum number of words to use for the summary | `10` |\n| `pred_name` | str | A character vector with the name of the new column where the prediction will be placed | `'summary'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#d2c856a7 .cell execution_count=15}\n``` {.python .cell-code}\n# Use max_words to set the maximum number of words to use for the summary\nreviews.llm.summarize(\"review\", max_words = 5)\n```\n\n::: {.cell-output .cell-output-display execution_count=15}\n```{=html}\n
\n
reviewsummary
"This has been the best TV I've ever used. Great screen, and sound.""great tv with good features"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop purchase was a mistake"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""feeling uncertain about new purchase"
\n```\n:::\n:::\n\n\n::: {#5b0affc2 .cell execution_count=16}\n``` {.python .cell-code}\n# Use 'pred_name' to customize the new column's name\nreviews.llm.summarize(\"review\", 5, pred_name = \"review_summary\")\n```\n\n::: {.cell-output .cell-output-display execution_count=16}\n```{=html}\n
\n
reviewreview_summary
"This has been the best TV I've ever used. Great screen, and sound.""great tv with good features"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop purchase was a mistake"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""feeling uncertain about new purchase"
\n```\n:::\n:::\n\n\n### translate { #mall.MallFrame.translate }\n\n`MallFrame.translate(col, language='', additional='', pred_name='translation')`\n\nTranslate text into another language.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------|----------------------------------------------------------------------------------------|-----------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `language` | str | The target language to translate to. For example 'French'. | `''` |\n| `pred_name` | str | A character vector with the name of the new column where the prediction will be placed | `'translation'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#60dd9231 .cell execution_count=17}\n``` {.python .cell-code}\nreviews.llm.translate(\"review\", \"spanish\")\n```\n\n::: {.cell-output .cell-output-display execution_count=17}\n```{=html}\n
\n
reviewtranslation
"This has been the best TV I've ever used. Great screen, and sound.""Esta ha sido la mejor televisión que he utilizado hasta ahora. Gran pantalla y sonido."
"I regret buying this laptop. It is too slow and the keyboard is too noisy""Me arrepiento de comprar este portátil. Es demasiado lento y la tecla es demasiado ruidosa."
"Not sure how to feel about my new washing machine. Great color, but hard to figure""No estoy seguro de cómo sentirme con mi nueva lavadora. Un color maravilloso, pero muy difícil de en…
\n```\n:::\n:::\n\n\n::: {#68f88b5c .cell execution_count=18}\n``` {.python .cell-code}\nreviews.llm.translate(\"review\", \"french\")\n```\n\n::: {.cell-output .cell-output-display execution_count=18}\n```{=html}\n
\n
reviewtranslation
"This has been the best TV I've ever used. Great screen, and sound.""Ceci était la meilleure télévision que j'ai jamais utilisée. Écran et son excellent."
"I regret buying this laptop. It is too slow and the keyboard is too noisy""Je me regrette d'avoir acheté ce portable. Il est trop lent et le clavier fait trop de bruit."
"Not sure how to feel about my new washing machine. Great color, but hard to figure""Je ne sais pas comment réagir à mon nouveau lave-linge. Couleur superbe, mais difficile à comprendre…
\n```\n:::\n:::\n\n\n### use { #mall.MallFrame.use }\n\n`MallFrame.use(backend='', model='', _cache='_mall_cache', **kwargs)`\n\nDefine the model, backend, and other options to use to\ninteract with the LLM.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|------------|--------|--------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------|\n| `backend` | str | The name of the backend to use. At the beginning of the session it defaults to \"ollama\". If passing `\"\"`, it will remain unchanged | `''` |\n| `model` | str | The name of the model tha the backend should use. At the beginning of the session it defaults to \"llama3.2\". If passing `\"\"`, it will remain unchanged | `''` |\n| `_cache` | str | The path of where to save the cached results. Passing `\"\"` disables the cache | `'_mall_cache'` |\n| `**kwargs` | | Arguments to pass to the downstream Python call. In this case, the `chat` function in `ollama` | `{}` |\n\n#### Examples\n\n::: {#23b4ad40 .cell execution_count=19}\n``` {.python .cell-code}\n# Additional arguments will be passed 'as-is' to the\n# downstream R function in this example, to ollama::chat()\nreviews.llm.use(\"ollama\", \"llama3.2\", seed = 100, temp = 0.1)\n```\n\n::: {.cell-output .cell-output-display execution_count=19}\n```\n{'backend': 'ollama',\n 'model': 'llama3.2',\n '_cache': '_mall_cache',\n 'options': {'seed': 100},\n 'seed': 100,\n 'temp': 0.1}\n```\n:::\n:::\n\n\n::: {#f6d41e13 .cell execution_count=20}\n``` {.python .cell-code}\n# During the Python session, you can change any argument\n# individually and it will retain all of previous\n# arguments used\nreviews.llm.use(temp = 0.3)\n```\n\n::: {.cell-output .cell-output-display execution_count=20}\n```\n{'backend': 'ollama',\n 'model': 'llama3.2',\n '_cache': '_mall_cache',\n 'options': {'seed': 100},\n 'seed': 100,\n 'temp': 0.3}\n```\n:::\n:::\n\n\n::: 
{#0e6291ad .cell execution_count=21}\n``` {.python .cell-code}\n# Use _cache to modify the target folder for caching\nreviews.llm.use(_cache = \"_my_cache\")\n```\n\n::: {.cell-output .cell-output-display execution_count=21}\n```\n{'backend': 'ollama',\n 'model': 'llama3.2',\n '_cache': '_my_cache',\n 'options': {'seed': 100},\n 'seed': 100,\n 'temp': 0.3}\n```\n:::\n:::\n\n\n::: {#8e49f143 .cell execution_count=22}\n``` {.python .cell-code}\n# Leave _cache empty to turn off this functionality\nreviews.llm.use(_cache = \"\")\n```\n\n::: {.cell-output .cell-output-display execution_count=22}\n```\n{'backend': 'ollama',\n 'model': 'llama3.2',\n '_cache': '',\n 'options': {'seed': 100},\n 'seed': 100,\n 'temp': 0.3}\n```\n:::\n:::\n\n\n### verify { #mall.MallFrame.verify }\n\n`MallFrame.verify(col, what='', yes_no=[1, 0], additional='', pred_name='verify')`\n\nCheck to see if something is true about the text.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `what` | str | The statement or question that needs to be verified against the provided text | `''` |\n| `yes_no` | list | A positional list of size 2, which contains the values to return if true and false. 
The first position will be used as the 'true' value, and the second as the 'false' value | `[1, 0]` |\n| `pred_name` | str | A character vector with the name of the new column where the prediction will be placed | `'verify'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#acd990e9 .cell execution_count=23}\n``` {.python .cell-code}\nreviews.llm.verify(\"review\", \"is the customer happy\")\n```\n\n::: {.cell-output .cell-output-display execution_count=23}\n```{=html}\n
\n
reviewverify
"This has been the best TV I've ever used. Great screen, and sound."1
"I regret buying this laptop. It is too slow and the keyboard is too noisy"0
"Not sure how to feel about my new washing machine. Great color, but hard to figure"0
\n```\n:::\n:::\n\n\n::: {#a6cae551 .cell execution_count=24}\n``` {.python .cell-code}\n# Use 'yes_no' to modify the 'true' and 'false' values to return\nreviews.llm.verify(\"review\", \"is the customer happy\", [\"y\", \"n\"])\n```\n\n::: {.cell-output .cell-output-display execution_count=24}\n```{=html}\n
\n
reviewverify
"This has been the best TV I've ever used. Great screen, and sound.""y"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""n"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""n"
\n```\n:::\n:::\n\n\n", + "markdown": "---\ntitle: MallFrame\n---\n\n\n\n`MallFrame(self, df)`\n\nExtension to Polars that add ability to use\nan LLM to run batch predictions over a data frame\n\nWe will start by loading the needed libraries, and\nset up the data frame that will be used in the\nexamples:\n\n\n::: {#68501681 .cell execution_count=1}\n``` {.python .cell-code}\nimport mall\nimport polars as pl\npl.Config(fmt_str_lengths=100)\npl.Config.set_tbl_hide_dataframe_shape(True)\npl.Config.set_tbl_hide_column_data_types(True)\ndata = mall.MallData\nreviews = data.reviews\nreviews.llm.use(options = dict(seed = 100))\n```\n:::\n\n\n## Methods\n\n| Name | Description |\n| --- | --- |\n| [classify](#mall.MallFrame.classify) | Classify text into specific categories. |\n| [custom](#mall.MallFrame.custom) | Provide the full prompt that the LLM will process. |\n| [extract](#mall.MallFrame.extract) | Pull a specific label from the text. |\n| [sentiment](#mall.MallFrame.sentiment) | Use an LLM to run a sentiment analysis |\n| [summarize](#mall.MallFrame.summarize) | Summarize the text down to a specific number of words. |\n| [translate](#mall.MallFrame.translate) | Translate text into another language. |\n| [use](#mall.MallFrame.use) | Define the model, backend, and other options to use to |\n| [verify](#mall.MallFrame.verify) | Check to see if something is true about the text. |\n\n### classify { #mall.MallFrame.classify }\n\n`MallFrame.classify(col, labels='', additional='', pred_name='classify')`\n\nClassify text into specific categories.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------|-------------------------------------------------------------------------------------------------------------------------|--------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `labels` | list | A list or a DICT object that defines the categories to classify the text as. It will return one of the provided labels. 
| `''` |\n| `pred_name` | str | A character vector with the name of the new column where the prediction will be placed | `'classify'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#05b300cb .cell execution_count=2}\n``` {.python .cell-code}\nreviews.llm.classify(\"review\", [\"appliance\", \"computer\"])\n```\n\n::: {.cell-output .cell-output-display execution_count=2}\n```{=html}\n
\n
reviewclassify
"This has been the best TV I've ever used. Great screen, and sound.""computer"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""computer"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""appliance"
\n```\n:::\n:::\n\n\n::: {#72994148 .cell execution_count=3}\n``` {.python .cell-code}\n# Use 'pred_name' to customize the new column's name\nreviews.llm.classify(\"review\", [\"appliance\", \"computer\"], pred_name=\"prod_type\")\n```\n\n::: {.cell-output .cell-output-display execution_count=3}\n```{=html}\n
\n
reviewprod_type
"This has been the best TV I've ever used. Great screen, and sound.""computer"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""computer"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""appliance"
\n```\n:::\n:::\n\n\n::: {#02cd9168 .cell execution_count=4}\n``` {.python .cell-code}\n#Pass a DICT to set custom values for each classification\nreviews.llm.classify(\"review\", {\"appliance\" : \"1\", \"computer\" : \"2\"})\n```\n\n::: {.cell-output .cell-output-display execution_count=4}\n```{=html}\n
\n
reviewclassify
"This has been the best TV I've ever used. Great screen, and sound.""1"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""2"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""1"
\n```\n:::\n:::\n\n\n### custom { #mall.MallFrame.custom }\n\n`MallFrame.custom(col, prompt='', valid_resps='', pred_name='custom')`\n\nProvide the full prompt that the LLM will process.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|-------------|--------|----------------------------------------------------------------------------------------|------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `prompt` | str | The prompt to send to the LLM along with the `col` | `''` |\n| `pred_name` | str | A character vector with the name of the new column where the prediction will be placed | `'custom'` |\n\n#### Examples\n\n::: {#1fbb9c3a .cell execution_count=5}\n``` {.python .cell-code}\nmy_prompt = (\n \"Answer a question.\"\n \"Return only the answer, no explanation\"\n \"Acceptable answers are 'yes', 'no'\"\n \"Answer this about the following text, is this a happy customer?:\"\n)\n\nreviews.llm.custom(\"review\", prompt = my_prompt)\n```\n\n::: {.cell-output .cell-output-display execution_count=5}\n```{=html}\n
\n
reviewcustom
"This has been the best TV I've ever used. Great screen, and sound.""Yes"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""No"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""No"
\n```\n:::\n:::\n\n\n### extract { #mall.MallFrame.extract }\n\n`MallFrame.extract(col, labels='', expand_cols=False, additional='', pred_name='extract')`\n\nPull a specific label from the text.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------|----------------------------------------------------------------------------------------|-------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `labels` | list | A list or a DICT object that defines tells the LLM what to look for and return | `''` |\n| `pred_name` | str | A character vector with the name of the new column where the prediction will be placed | `'extract'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#9e88f97a .cell execution_count=6}\n``` {.python .cell-code}\n# Use 'labels' to let the function know what to extract\nreviews.llm.extract(\"review\", labels = \"product\")\n```\n\n::: {.cell-output .cell-output-display execution_count=6}\n```{=html}\n
\n
reviewextract
"This has been the best TV I've ever used. Great screen, and sound.""tv"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""washing machine"
\n```\n:::\n:::\n\n\n::: {#b02b41dc .cell execution_count=7}\n``` {.python .cell-code}\n# Use 'pred_name' to customize the new column's name\nreviews.llm.extract(\"review\", \"product\", pred_name = \"prod\")\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n```{=html}\n
\n
reviewprod
"This has been the best TV I've ever used. Great screen, and sound.""tv"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""washing machine"
\n```\n:::\n:::\n\n\n::: {#e7fe5a38 .cell execution_count=8}\n``` {.python .cell-code}\n# Pass a vector to request multiple things, the results will be pipe delimeted\n# in a single column\nreviews.llm.extract(\"review\", [\"product\", \"feelings\"])\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n```{=html}\n
\n
reviewextract
"This has been the best TV I've ever used. Great screen, and sound.""tv | great"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop|frustration"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""washing machine | confusion"
\n```\n:::\n:::\n\n\n::: {#e6ccd8fa .cell execution_count=9}\n``` {.python .cell-code}\n# Set 'expand_cols' to True to split multiple lables\n# into individual columns\nreviews.llm.extract(\n col=\"review\",\n labels=[\"product\", \"feelings\"],\n expand_cols=True\n )\n```\n\n::: {.cell-output .cell-output-display execution_count=9}\n```{=html}\n
\n
reviewproductfeelings
"This has been the best TV I've ever used. Great screen, and sound.""tv "" great"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop""frustration"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""washing machine "" confusion"
\n```\n:::\n:::\n\n\n::: {#1263a5bf .cell execution_count=10}\n``` {.python .cell-code}\n# Set custom names to the resulting columns\nreviews.llm.extract(\n col=\"review\",\n labels={\"prod\": \"product\", \"feels\": \"feelings\"},\n expand_cols=True\n )\n```\n\n::: {.cell-output .cell-output-display execution_count=10}\n```{=html}\n
\n
reviewprodfeels
"This has been the best TV I've ever used. Great screen, and sound.""tv "" great"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""laptop""frustration"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""washing machine "" confusion"
\n```\n:::\n:::\n\n\n### sentiment { #mall.MallFrame.sentiment }\n\n`MallFrame.sentiment(col, options=['positive', 'negative', 'neutral'], additional='', pred_name='sentiment')`\n\nUse an LLM to run a sentiment analysis\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------------|----------------------------------------------------------------------------------------|---------------------------------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `options` | list or dict | A list of the sentiment options to use, or a named DICT object | `['positive', 'negative', 'neutral']` |\n| `pred_name` | str | A character vector with the name of the new column where the prediction will be placed | `'sentiment'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#7c951e33 .cell execution_count=11}\n``` {.python .cell-code}\nreviews.llm.sentiment(\"review\")\n```\n\n::: {.cell-output .cell-output-display execution_count=11}\n```{=html}\n
\n
reviewsentiment
"This has been the best TV I've ever used. Great screen, and sound.""positive"
"I regret buying this laptop. It is too slow and the keyboard is too noisy""negative"
"Not sure how to feel about my new washing machine. Great color, but hard to figure""neutral"
\n```\n:::\n:::\n\n\n::: {#3aadb9e4 .cell execution_count=12}\n``` {.python .cell-code}\n# Use 'pred_name' to customize the new column's name\nreviews.llm.sentiment(\"review\", pred_name=\"review_sentiment\")\n```\n\n::: {.cell-output .cell-output-display execution_count=12}\n```{=html}\n
\n
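<table>
<thead><tr><th>review</th><th>review_sentiment</th></tr></thead>
<tbody>
<tr><td>&quot;This has been the best TV I've ever used. Great screen, and sound.&quot;</td><td>&quot;positive&quot;</td></tr>
<tr><td>&quot;I regret buying this laptop. It is too slow and the keyboard is too noisy&quot;</td><td>&quot;negative&quot;</td></tr>
<tr><td>&quot;Not sure how to feel about my new washing machine. Great color, but hard to figure&quot;</td><td>&quot;neutral&quot;</td></tr>
</tbody>
</table>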
\n```\n:::\n:::\n\n\n::: {#19e9ad37 .cell execution_count=13}\n``` {.python .cell-code}\n# Pass custom sentiment options\nreviews.llm.sentiment(\"review\", [\"positive\", \"negative\"])\n```\n\n::: {.cell-output .cell-output-display execution_count=13}\n```{=html}\n
\n
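<table>
<thead><tr><th>review</th><th>sentiment</th></tr></thead>
<tbody>
<tr><td>&quot;This has been the best TV I've ever used. Great screen, and sound.&quot;</td><td>&quot;positive&quot;</td></tr>
<tr><td>&quot;I regret buying this laptop. It is too slow and the keyboard is too noisy&quot;</td><td>&quot;negative&quot;</td></tr>
<tr><td>&quot;Not sure how to feel about my new washing machine. Great color, but hard to figure&quot;</td><td>&quot;negative&quot;</td></tr>
</tbody>
</table>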
\n```\n:::\n:::\n\n\n::: {#ebd970bb .cell execution_count=14}\n``` {.python .cell-code}\n# Use a DICT object to specify values to return per sentiment\nreviews.llm.sentiment(\"review\", {\"positive\" : 1, \"negative\" : 0})\n```\n\n::: {.cell-output .cell-output-display execution_count=14}\n```{=html}\n
\n
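<table>
<thead><tr><th>review</th><th>sentiment</th></tr></thead>
<tbody>
<tr><td>&quot;This has been the best TV I've ever used. Great screen, and sound.&quot;</td><td>1</td></tr>
<tr><td>&quot;I regret buying this laptop. It is too slow and the keyboard is too noisy&quot;</td><td>0</td></tr>
<tr><td>&quot;Not sure how to feel about my new washing machine. Great color, but hard to figure&quot;</td><td>0</td></tr>
</tbody>
</table>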
\n```\n:::\n:::\n\n\n### summarize { #mall.MallFrame.summarize }\n\n`MallFrame.summarize(col, max_words=10, additional='', pred_name='summary')`\n\nSummarize the text down to a specific number of words.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------|----------------------------------------------------------------------------------------|-------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `max_words` | int | Maximum number of words to use for the summary | `10` |\n| `pred_name` | str | The name of the new column where the prediction will be placed | `'summary'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#7c48fa25 .cell execution_count=15}\n``` {.python .cell-code}\n# Use max_words to set the maximum number of words to use for the summary\nreviews.llm.summarize(\"review\", max_words = 5)\n```\n\n::: {.cell-output .cell-output-display execution_count=15}\n```{=html}\n
\n
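<table>
<thead><tr><th>review</th><th>summary</th></tr></thead>
<tbody>
<tr><td>&quot;This has been the best TV I've ever used. Great screen, and sound.&quot;</td><td>&quot;great tv with good features&quot;</td></tr>
<tr><td>&quot;I regret buying this laptop. It is too slow and the keyboard is too noisy&quot;</td><td>&quot;laptop purchase was a mistake&quot;</td></tr>
<tr><td>&quot;Not sure how to feel about my new washing machine. Great color, but hard to figure&quot;</td><td>&quot;feeling uncertain about new purchase&quot;</td></tr>
</tbody>
</table>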
\n```\n:::\n:::\n\n\n::: {#efffcd2e .cell execution_count=16}\n``` {.python .cell-code}\n# Use 'pred_name' to customize the new column's name\nreviews.llm.summarize(\"review\", 5, pred_name = \"review_summary\")\n```\n\n::: {.cell-output .cell-output-display execution_count=16}\n```{=html}\n
\n
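<table>
<thead><tr><th>review</th><th>review_summary</th></tr></thead>
<tbody>
<tr><td>&quot;This has been the best TV I've ever used. Great screen, and sound.&quot;</td><td>&quot;great tv with good features&quot;</td></tr>
<tr><td>&quot;I regret buying this laptop. It is too slow and the keyboard is too noisy&quot;</td><td>&quot;laptop purchase was a mistake&quot;</td></tr>
<tr><td>&quot;Not sure how to feel about my new washing machine. Great color, but hard to figure&quot;</td><td>&quot;feeling uncertain about new purchase&quot;</td></tr>
</tbody>
</table>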
\n```\n:::\n:::\n\n\n### translate { #mall.MallFrame.translate }\n\n`MallFrame.translate(col, language='', additional='', pred_name='translation')`\n\nTranslate text into another language.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------|----------------------------------------------------------------------------------------|-----------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `language` | str | The target language to translate to. For example 'French'. | `''` |\n| `pred_name` | str | The name of the new column where the prediction will be placed | `'translation'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#216a04c1 .cell execution_count=17}\n``` {.python .cell-code}\nreviews.llm.translate(\"review\", \"spanish\")\n```\n\n::: {.cell-output .cell-output-display execution_count=17}\n```{=html}\n
\n
<table>
<thead><tr><th>review</th><th>translation</th></tr></thead>
<tbody>
<tr><td>&quot;This has been the best TV I've ever used. Great screen, and sound.&quot;</td><td>&quot;Esta ha sido la mejor televisión que he utilizado hasta ahora. Gran pantalla y sonido.&quot;</td></tr>
<tr><td>&quot;I regret buying this laptop. It is too slow and the keyboard is too noisy&quot;</td><td>&quot;Me arrepiento de comprar este portátil. Es demasiado lento y la tecla es demasiado ruidosa.&quot;</td></tr>
<tr><td>&quot;Not sure how to feel about my new washing machine. Great color, but hard to figure&quot;</td><td>&quot;No estoy seguro de cómo sentirme con mi nueva lavadora. Un color maravilloso, pero muy difícil de en…</td></tr>
</tbody>
</table>
\n```\n:::\n:::\n\n\n::: {#1a466ee5 .cell execution_count=18}\n``` {.python .cell-code}\nreviews.llm.translate(\"review\", \"french\")\n```\n\n::: {.cell-output .cell-output-display execution_count=18}\n```{=html}\n
\n
<table>
<thead><tr><th>review</th><th>translation</th></tr></thead>
<tbody>
<tr><td>&quot;This has been the best TV I've ever used. Great screen, and sound.&quot;</td><td>&quot;Ceci était la meilleure télévision que j'ai jamais utilisée. Écran et son excellent.&quot;</td></tr>
<tr><td>&quot;I regret buying this laptop. It is too slow and the keyboard is too noisy&quot;</td><td>&quot;Je me regrette d'avoir acheté ce portable. Il est trop lent et le clavier fait trop de bruit.&quot;</td></tr>
<tr><td>&quot;Not sure how to feel about my new washing machine. Great color, but hard to figure&quot;</td><td>&quot;Je ne sais pas comment réagir à mon nouveau lave-linge. Couleur superbe, mais difficile à comprendre…</td></tr>
</tbody>
</table>
\n```\n:::\n:::\n\n\n### use { #mall.MallFrame.use }\n\n`MallFrame.use(backend='', model='', _cache='_mall_cache', **kwargs)`\n\nDefine the model, backend, and other options to use to\ninteract with the LLM.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|------------|--------|--------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------|\n| `backend` | str | The name of the backend to use. At the beginning of the session it defaults to \"ollama\". If passing `\"\"`, it will remain unchanged | `''` |\n| `model` | str | The name of the model that the backend should use. At the beginning of the session it defaults to \"llama3.2\". If passing `\"\"`, it will remain unchanged | `''` |\n| `_cache` | str | The path where the cached results are saved. Passing `\"\"` disables the cache | `'_mall_cache'` |\n| `**kwargs` | | Arguments to pass to the downstream Python call. In this case, the `chat` function in `ollama` | `{}` |\n\n#### Examples\n\n::: {#661cc064 .cell execution_count=19}\n``` {.python .cell-code}\n# Additional arguments will be passed 'as-is' to the\n# downstream Python function; in this example, to ollama.chat()\nreviews.llm.use(\"ollama\", \"llama3.2\", options = dict(seed = 100, temperature = 0.1))\n```\n\n::: {.cell-output .cell-output-display execution_count=19}\n```\n{'backend': 'ollama',\n 'model': 'llama3.2',\n '_cache': '_mall_cache',\n 'options': {'seed': 100, 'temperature': 0.1}}\n```\n:::\n:::\n\n\n::: {#465f0062 .cell execution_count=20}\n``` {.python .cell-code}\n# During the Python session, you can change any argument\n# individually and it will retain all of the previous\n# arguments used\nreviews.llm.use(options = dict(temperature = 0.3))\n```\n\n::: {.cell-output .cell-output-display execution_count=20}\n```\n{'backend': 'ollama',\n 'model': 'llama3.2',\n '_cache': '_mall_cache',\n 'options': {'temperature': 
0.3}}\n```\n:::\n:::\n\n\n::: {#bbbff857 .cell execution_count=21}\n``` {.python .cell-code}\n# Use _cache to modify the target folder for caching\nreviews.llm.use(_cache = \"_my_cache\")\n```\n\n::: {.cell-output .cell-output-display execution_count=21}\n```\n{'backend': 'ollama',\n 'model': 'llama3.2',\n '_cache': '_my_cache',\n 'options': {'temperature': 0.3}}\n```\n:::\n:::\n\n\n::: {#dd3af28d .cell execution_count=22}\n``` {.python .cell-code}\n# Pass an empty string to _cache to turn off caching\nreviews.llm.use(_cache = \"\")\n```\n\n::: {.cell-output .cell-output-display execution_count=22}\n```\n{'backend': 'ollama',\n 'model': 'llama3.2',\n '_cache': '',\n 'options': {'temperature': 0.3}}\n```\n:::\n:::\n\n\n### verify { #mall.MallFrame.verify }\n\n`MallFrame.verify(col, what='', yes_no=[1, 0], additional='', pred_name='verify')`\n\nCheck to see if something is true about the text.\n\n#### Parameters\n\n| Name | Type | Description | Default |\n|--------------|--------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|\n| `col` | str | The name of the text field to process | _required_ |\n| `what` | str | The statement or question that needs to be verified against the provided text | `''` |\n| `yes_no` | list | A positional list of size 2, which contains the values to return if true and false. The first position will be used as the 'true' value, and the second as the 'false' value | `[1, 0]` |\n| `pred_name` | str | The name of the new column where the prediction will be placed | `'verify'` |\n| `additional` | str | Inserts this text into the prompt sent to the LLM | `''` |\n\n#### Examples\n\n::: {#0d259bf3 .cell execution_count=23}\n``` {.python .cell-code}\nreviews.llm.verify(\"review\", \"is the customer happy\")\n```\n\n::: {.cell-output .cell-output-display execution_count=23}\n```{=html}\n
\n
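<table>
<thead><tr><th>review</th><th>verify</th></tr></thead>
<tbody>
<tr><td>&quot;This has been the best TV I've ever used. Great screen, and sound.&quot;</td><td>1</td></tr>
<tr><td>&quot;I regret buying this laptop. It is too slow and the keyboard is too noisy&quot;</td><td>0</td></tr>
<tr><td>&quot;Not sure how to feel about my new washing machine. Great color, but hard to figure&quot;</td><td>0</td></tr>
</tbody>
</table>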
\n```\n:::\n:::\n\n\n::: {#064ae124 .cell execution_count=24}\n``` {.python .cell-code}\n# Use 'yes_no' to modify the 'true' and 'false' values to return\nreviews.llm.verify(\"review\", \"is the customer happy\", [\"y\", \"n\"])\n```\n\n::: {.cell-output .cell-output-display execution_count=24}\n```{=html}\n
\n
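<table>
<thead><tr><th>review</th><th>verify</th></tr></thead>
<tbody>
<tr><td>&quot;This has been the best TV I've ever used. Great screen, and sound.&quot;</td><td>&quot;y&quot;</td></tr>
<tr><td>&quot;I regret buying this laptop. It is too slow and the keyboard is too noisy&quot;</td><td>&quot;n&quot;</td></tr>
<tr><td>&quot;Not sure how to feel about my new washing machine. Great color, but hard to figure&quot;</td><td>&quot;n&quot;</td></tr>
</tbody>
</table>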
\n```\n:::\n:::\n\n\n", "supporting": [ "MallFrame_files" ], diff --git a/python/mall/polars.py b/python/mall/polars.py index 9f80b68..447e397 100644 --- a/python/mall/polars.py +++ b/python/mall/polars.py @@ -64,14 +64,14 @@ def use(self, backend="", model="", _cache="_mall_cache", **kwargs): ```{python} # Additional arguments will be passed 'as-is' to the # downstream Python function; in this example, to ollama.chat() - reviews.llm.use("ollama", "llama3.2", seed = 100, temperature = 0.1) + reviews.llm.use("ollama", "llama3.2", options = dict(seed = 100, temperature = 0.1)) ``` ```{python} # During the Python session, you can change any argument # individually and it will retain all of the previous # arguments used - reviews.llm.use(temperature = 0.3) + reviews.llm.use(options = dict(temperature = 0.3)) ``` ```{python} diff --git a/reference/MallFrame.qmd b/reference/MallFrame.qmd index 2da1411..3b02db3 100644 --- a/reference/MallFrame.qmd +++ b/reference/MallFrame.qmd @@ -253,14 +253,14 @@ interact with the LLM. ```{python} # Additional arguments will be passed 'as-is' to the # downstream Python function; in this example, to ollama.chat() -reviews.llm.use("ollama", "llama3.2", seed = 100, temp = 0.1) +reviews.llm.use("ollama", "llama3.2", options = dict(seed = 100, temperature = 0.1)) ``` ```{python} # During the Python session, you can change any argument # individually and it will retain all of the previous # arguments used -reviews.llm.use(temp = 0.3) +reviews.llm.use(options = dict(temperature = 0.3)) ``` ```{python} diff --git a/site/README.md b/site/README.md index 916f840..f711a45 100644 --- a/site/README.md +++ b/site/README.md @@ -2,6 +2,7 @@ To re-create the reference files, and capture the possibly new output from the resulting Quarto files, use the following steps: ```bash +uv pip install python/ rm -rf _freeze/reference R -e 'pkgsite::write_reference()' quartodoc build --verbose