
fix(helpers): cost of prompt and completion tokens for OpenAI models #618

Merged
merged 2 commits into Sinaptik-AI:main on Oct 5, 2023

Conversation

mspronesti
Contributor

@mspronesti mspronesti commented Oct 4, 2023

Hi @gventuri,
this PR aims to separate the cost evaluation of prompt tokens and completion/chat completion tokens, which are charged at different rates (a brief illustrative sketch follows the summary below).

Summary by CodeRabbit

  • New Feature: Added support for GPT-4 and GPT-3.5 models in the OpenAI helper.
  • Refactor: Updated get_openai_token_cost_for_model function to distinguish between prompt tokens and completion tokens, improving cost calculation accuracy.
  • New Feature: Enhanced the __call__ method to separately calculate costs for prompt and completion tokens, providing more granular control over token usage.
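
To make the billing model concrete, here is a minimal, self-contained sketch of the idea. The per-1K rates and names below are illustrative assumptions, not values taken from this PR:

# Hypothetical per-1K rates for a single model; prompt (input) and
# completion (output) tokens are billed at different rates.
PROMPT_RATE_PER_1K = 0.03
COMPLETION_RATE_PER_1K = 0.06

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Charge each token type at its own per-1K rate and sum the two."""
    prompt_cost = (prompt_tokens / 1000) * PROMPT_RATE_PER_1K
    completion_cost = (completion_tokens / 1000) * COMPLETION_RATE_PER_1K
    return prompt_cost + completion_cost

print(estimate_cost(500, 200))  # 0.015 + 0.012 = 0.027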

@coderabbitai
Contributor

coderabbitai bot commented Oct 4, 2023

Walkthrough

The changes introduce new model names for GPT-4 and GPT-3.5 in the openai_info.py file, with a focus on distinguishing between prompt tokens and completion tokens. The cost calculation for each token type has been separated, enhancing the granularity of the process.

Changes

File Path: .../helpers/openai_info.py
Summary: Introduced GPT-4 and GPT-3.5 models, distinguished between prompt and completion tokens, and modified cost calculation methods to handle these tokens separately.

🐇💻
In the land of code where the shadows lie,
New models emerge under the AI sky.
Tokens prompt, and tokens complete,
Each with a cost, none obsolete.
Celebrate the change, for it's no small feat,
In the world of AI, we never admit defeat! 🎉🥕



Contributor

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 0

Commits: files that changed from the base of the PR, between a19cfe6 and ebb91e3.
Files selected for processing (1)
  • pandasai/helpers/openai_info.py (2 hunks)
Additional comments (Suppressed): 3
pandasai/helpers/openai_info.py (3)
  • 10-44: The MODEL_COST_PER_1K_TOKENS dictionary has been updated to include new model names for GPT-4, GPT-3.5, and Azure GPT-35, along with their respective costs per 1000 tokens. The cost values are now differentiated between input and output (completion) tokens for each model. Ensure that these cost values are accurate and up-to-date.

  • 47-75: The function get_openai_token_cost_for_model() has been modified to handle completion tokens separately. A new argument is_completion has been added to the function signature. If is_completion is True and the model name starts with "gpt-4", "gpt-3.5", or "gpt-35", the function appends "-completion" to the model name before looking up the cost in the MODEL_COST_PER_1K_TOKENS dictionary. This change allows the function to calculate the cost of prompt tokens and completion tokens independently, enhancing the accuracy of token cost calculation for different models and operations. Make sure all calls to this function throughout the codebase have been updated to match the new signature.

  • 99-112: In the __call__ method of the OpenAICallbackHandler class, the total cost calculation has been updated to separately account for the cost of prompt tokens and completion tokens. The get_openai_token_cost_for_model() function is called twice, once for prompt tokens and once for completion tokens, with the is_completion argument set to True for the latter. The costs are then added together to get the total cost. This change improves the precision of the cost tracking mechanism by treating prompt tokens and completion tokens as distinct entities. Ensure that this modification does not introduce any unexpected side effects in the overall cost calculation process.
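
For readers without the diff in front of them, the keying convention described above might look roughly like this. The entries and prices are illustrative assumptions, not copied from pandasai/helpers/openai_info.py; check OpenAI's pricing page for current numbers:

# Illustrative subset of a per-1K pricing table: prompt (input) rates under
# the plain model name, completion (output) rates under a "-completion" suffix.
MODEL_COST_PER_1K_TOKENS = {
    "gpt-4": 0.03,
    "gpt-4-completion": 0.06,
    "gpt-3.5-turbo": 0.0015,
    "gpt-3.5-turbo-completion": 0.002,
}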

Contributor

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 0

Commits: files that changed from the base of the PR, between ebb91e3 and 065ba14.
Files selected for processing (1)
  • pandasai/helpers/openai_info.py (2 hunks)
Additional comments (Suppressed): 3
pandasai/helpers/openai_info.py (3)
  • 10-44: The MODEL_COST_PER_1K_TOKENS dictionary has been updated to include new model names for GPT-4, GPT-3.5, and Azure GPT-35. The cost per 1000 tokens is now differentiated between input (prompt) and output (completion) tokens. Ensure that these costs are accurate and up-to-date with OpenAI's pricing.

  • 47-75: The function get_openai_token_cost_for_model() has been modified to handle prompt and completion tokens separately. An additional parameter is_completion has been introduced to indicate whether the tokens are for completion or not. This change allows for more accurate cost calculation based on token type. However, ensure that all calls to this function throughout the codebase have been updated to match the new signature.

- def get_openai_token_cost_for_model(
-    model_name: str,
-    num_tokens: int,
- ) -> float:
+ def get_openai_token_cost_for_model(
+        model_name: str,
+        num_tokens: int,
+        is_completion: bool = False,
+ ) -> float:
  • 99-114: The __call__ method in the OpenAICallbackHandler class has been updated to calculate the cost of prompt tokens and completion tokens independently. This separation of cost calculation enhances the accuracy of the system's token cost evaluation. Ensure that the response.usage object contains both prompt_tokens and completion_tokens fields.
- total_cost = get_openai_token_cost_for_model(model_name, usage.total_tokens)
- self.total_cost += total_cost
+ prompt_cost = get_openai_token_cost_for_model(
+     model_name, usage.prompt_tokens
+ )
+ completion_cost = get_openai_token_cost_for_model(
+     model_name, usage.completion_tokens, is_completion=True
+ )
+ self.total_cost += prompt_cost + completion_cost
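
As a sanity check on the reviewer's note about response.usage, here is a small standalone sketch of the same accumulation using a stand-in usage object. The field names mirror the OpenAI usage payload; the rates are assumed for illustration:

from types import SimpleNamespace

# Stand-in for response.usage; both prompt_tokens and completion_tokens
# must be present for the split cost calculation to work.
usage = SimpleNamespace(prompt_tokens=500, completion_tokens=200, total_tokens=700)

prompt_rate_per_1k = 0.03       # assumed input rate for the model in use
completion_rate_per_1k = 0.06   # assumed output rate for the same model

total_cost = 0.0
total_cost += (usage.prompt_tokens / 1000) * prompt_rate_per_1k
total_cost += (usage.completion_tokens / 1000) * completion_rate_per_1k
print(total_cost)  # 0.027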

Contributor

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 0

Commits: files that changed from the base of the PR, between ebb91e3 and f1bf92d.
Files selected for processing (1)
  • pandasai/helpers/openai_info.py (2 hunks)
Additional comments (Suppressed): 3
pandasai/helpers/openai_info.py (3)
  • 10-44: The MODEL_COST_PER_1K_TOKENS dictionary has been updated to include new model names for GPT-4, GPT-3.5, and Azure GPT-35. The cost per 1000 tokens is now differentiated between input and output (completion) for each model. Ensure that these costs are accurate and up-to-date with OpenAI's pricing.

  • 47-75: The function get_openai_token_cost_for_model() has been modified to include an additional parameter is_completion. This parameter is used to differentiate between prompt and completion tokens when calculating the cost. If is_completion is True, "-completion" is appended to the model name to fetch the correct cost from the MODEL_COST_PER_1K_TOKENS dictionary. This change allows for more precise cost calculation by treating prompt and completion tokens separately.

- def get_openai_token_cost_for_model(
-    model_name: str,
-    num_tokens: int,
- ) -> float:
+ def get_openai_token_cost_for_model(
+        model_name: str,
+        num_tokens: int,
+        is_completion: bool = False,
+ ) -> float:
  • 99-114: The __call__() method of OpenAICallbackHandler has been updated to calculate the cost of prompt tokens and completion tokens separately. It uses the get_openai_token_cost_for_model() function with the is_completion parameter set accordingly. This change aligns with the modification in the get_openai_token_cost_for_model() function and ensures a more accurate total cost calculation.
- total_cost = get_openai_token_cost_for_model(model_name, usage.total_tokens)
- self.total_cost += total_cost
+ prompt_cost = get_openai_token_cost_for_model(
+     model_name, usage.prompt_tokens
+ )
+ completion_cost = get_openai_token_cost_for_model(
+     model_name, usage.completion_tokens, is_completion=True
+ )
+ self.total_cost += prompt_cost + completion_cost
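
Putting the two review comments together, a self-contained sketch of the lookup helper could look like the following. The pricing entries, the handling of unknown models, and the exact prefix check are assumptions reconstructed from the comments above, not the PR's actual code:

# Illustrative subset of the pricing table (see the earlier sketch).
MODEL_COST_PER_1K_TOKENS = {
    "gpt-4": 0.03,
    "gpt-4-completion": 0.06,
    "gpt-3.5-turbo": 0.0015,
    "gpt-3.5-turbo-completion": 0.002,
}

def get_openai_token_cost_for_model(
    model_name: str,
    num_tokens: int,
    is_completion: bool = False,
) -> float:
    # Completion (output) tokens are priced under the "-completion" key.
    if is_completion and model_name.startswith(("gpt-4", "gpt-3.5", "gpt-35")):
        model_name = model_name + "-completion"
    if model_name not in MODEL_COST_PER_1K_TOKENS:
        # How unknown models are handled is an assumption here.
        raise ValueError(f"Unknown model: {model_name}")
    return MODEL_COST_PER_1K_TOKENS[model_name] * (num_tokens / 1000)

Called twice per response, once with is_completion=True, this reproduces the prompt_cost + completion_cost split shown in the diff above.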

@codecov-commenter

Codecov Report

Merging #618 (f1bf92d) into main (a19cfe6) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #618      +/-   ##
==========================================
+ Coverage   83.53%   83.54%   +0.01%     
==========================================
  Files          55       55              
  Lines        2696     2699       +3     
==========================================
+ Hits         2252     2255       +3     
  Misses        444      444              
Files Coverage Δ
pandasai/helpers/openai_info.py 80.00% <100.00%> (+1.62%) ⬆️

@mspronesti
Contributor Author

@gventuri Could you re-run the failing workflow? Judging from the log files, it seems like the specific runner on GitHub's side failed to install the dependencies. Re-running the jobs should fix it.

@mspronesti mspronesti changed the title fix(helpers): separate cost of prompt and completion fix(helpers): separate cost of prompt and completion for OpenAI Oct 5, 2023
@mspronesti mspronesti changed the title fix(helpers): separate cost of prompt and completion for OpenAI fix(helpers): separate cost of prompt and completion for OpenAI models Oct 5, 2023
@mspronesti mspronesti changed the title fix(helpers): separate cost of prompt and completion for OpenAI models fix(helpers): cost of prompt and completion tokens for OpenAI models Oct 5, 2023
@gventuri
Collaborator

gventuri commented Oct 5, 2023

@mspronesti done, merging :)

@gventuri gventuri merged commit 297b36e into Sinaptik-AI:main Oct 5, 2023
9 checks passed