fix(helpers): cost of prompt and completion tokens for OpenAI models #618
Walkthrough
The changes introduce new model names for GPT-4 and GPT-3.5 in the …
Review Status
Actionable comments generated: 0
Files selected for processing (1)
- pandasai/helpers/openai_info.py (2 hunks)
Additional comments (Suppressed): 3
pandasai/helpers/openai_info.py (3)
10-44: The MODEL_COST_PER_1K_TOKENS dictionary has been updated to include new model names for GPT-4, GPT-3.5, and Azure GPT-35, along with their respective costs per 1000 tokens. The cost values are now differentiated between input (prompt) and output (completion) tokens for each model. Ensure that these cost values are accurate and up-to-date.
47-75: The function get_openai_token_cost_for_model() has been modified to handle completion tokens separately. A new argument, is_completion, has been added to the function signature. If is_completion is True and the model name starts with "gpt-4", "gpt-3.5", or "gpt-35", the function appends "-completion" to the model name before looking up the cost in the MODEL_COST_PER_1K_TOKENS dictionary. This change allows the function to calculate the cost of prompt tokens and completion tokens independently, enhancing the accuracy of token cost calculation for different models and operations. Make sure all calls to this function throughout the codebase have been updated to match the new signature.
99-112: In the __call__ method of the OpenAICallbackHandler class, the total cost calculation has been updated to separately account for the cost of prompt tokens and completion tokens. The get_openai_token_cost_for_model() function is called twice, once for prompt tokens and once for completion tokens, with the is_completion argument set to True for the latter. The costs are then added together to get the total cost. This change improves the precision of the cost tracking mechanism by treating prompt tokens and completion tokens as distinct entities. Ensure that this modification does not introduce any unexpected side effects in the overall cost calculation process.
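The lookup described in this review can be sketched roughly as follows. This is a minimal illustration, not the code from pandasai/helpers/openai_info.py: the dictionary entries and their rates are placeholder assumptions, and the error handling for unknown models is a guess at reasonable behaviour.

```python
# Placeholder rates in USD per 1,000 tokens. These values are assumptions
# for illustration only; they are not taken from the PR's pricing table.
MODEL_COST_PER_1K_TOKENS = {
    "gpt-4": 0.03,
    "gpt-4-completion": 0.06,
    "gpt-3.5-turbo": 0.0015,
    "gpt-3.5-turbo-completion": 0.002,
}


def get_openai_token_cost_for_model(
    model_name: str,
    num_tokens: int,
    is_completion: bool = False,
) -> float:
    """Return the cost in USD of num_tokens tokens for model_name."""
    model_name = model_name.lower()
    # Completion tokens are priced under a separate "-completion" key.
    if is_completion and model_name.startswith(("gpt-4", "gpt-3.5", "gpt-35")):
        model_name += "-completion"
    if model_name not in MODEL_COST_PER_1K_TOKENS:
        # Assumed behaviour: fail loudly rather than silently price at zero.
        raise ValueError(f"Unknown model: {model_name}")
    return MODEL_COST_PER_1K_TOKENS[model_name] * num_tokens / 1000


# The same model is priced differently depending on the token type.
prompt_cost = get_openai_token_cost_for_model("gpt-4", 1000)
completion_cost = get_openai_token_cost_for_model("gpt-4", 500, is_completion=True)
```

Because is_completion defaults to False, existing two-argument calls keep working and are priced at the prompt rate.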
Review Status
Actionable comments generated: 0
Files selected for processing (1)
- pandasai/helpers/openai_info.py (2 hunks)
Additional comments (Suppressed): 3
pandasai/helpers/openai_info.py (3)
10-44: The MODEL_COST_PER_1K_TOKENS dictionary has been updated to include new model names for GPT-4, GPT-3.5, and Azure GPT-35. The cost per 1000 tokens is now differentiated between input (prompt) and output (completion) tokens. Ensure that these costs are accurate and up-to-date with OpenAI's pricing.
47-75: The function get_openai_token_cost_for_model() has been modified to handle prompt and completion tokens separately. An additional parameter, is_completion, has been introduced to indicate whether the tokens are for completion or not. This change allows for more accurate cost calculation based on token type. However, ensure that all calls to this function throughout the codebase have been updated to match the new signature.
- def get_openai_token_cost_for_model(
-     model_name: str,
-     num_tokens: int,
- ) -> float:
+ def get_openai_token_cost_for_model(
+     model_name: str,
+     num_tokens: int,
+     is_completion: bool = False,
+ ) -> float:
99-114: The __call__ method in the OpenAICallbackHandler class has been updated to calculate the cost of prompt tokens and completion tokens independently. This separation of cost calculation enhances the accuracy of the system's token cost evaluation. Ensure that the response.usage object contains both prompt_tokens and completion_tokens fields.
- total_cost = get_openai_token_cost_for_model(model_name, usage.total_tokens)
- self.total_cost += total_cost
+ prompt_cost = get_openai_token_cost_for_model(
+     model_name, usage.prompt_tokens
+ )
+ completion_cost = get_openai_token_cost_for_model(
+     model_name, usage.completion_tokens, is_completion=True
+ )
+ self.total_cost += prompt_cost + completion_cost
065ba14 to f1bf92d
Review Status
Actionable comments generated: 0
Files selected for processing (1)
- pandasai/helpers/openai_info.py (2 hunks)
Additional comments (Suppressed): 3
pandasai/helpers/openai_info.py (3)
10-44: The MODEL_COST_PER_1K_TOKENS dictionary has been updated to include new model names for GPT-4, GPT-3.5, and Azure GPT-35. The cost per 1000 tokens is now differentiated between input (prompt) and output (completion) tokens for each model. Ensure that these costs are accurate and up-to-date with OpenAI's pricing.
47-75: The function get_openai_token_cost_for_model() has been modified to include an additional parameter, is_completion. This parameter is used to differentiate between prompt and completion tokens when calculating the cost. If is_completion is True, "-completion" is appended to the model name to fetch the correct cost from the MODEL_COST_PER_1K_TOKENS dictionary. This change allows for more precise cost calculation by treating prompt and completion tokens separately.
- def get_openai_token_cost_for_model(
-     model_name: str,
-     num_tokens: int,
- ) -> float:
+ def get_openai_token_cost_for_model(
+     model_name: str,
+     num_tokens: int,
+     is_completion: bool = False,
+ ) -> float:
99-114: The __call__() method of OpenAICallbackHandler has been updated to calculate the cost of prompt tokens and completion tokens separately. It uses the get_openai_token_cost_for_model() function with the is_completion parameter set accordingly. This change aligns with the modification in the get_openai_token_cost_for_model() function and ensures a more accurate total cost calculation.
- total_cost = get_openai_token_cost_for_model(model_name, usage.total_tokens)
- self.total_cost += total_cost
+ prompt_cost = get_openai_token_cost_for_model(
+     model_name, usage.prompt_tokens
+ )
+ completion_cost = get_openai_token_cost_for_model(
+     model_name, usage.completion_tokens, is_completion=True
+ )
+ self.total_cost += prompt_cost + completion_cost
Codecov Report
@@            Coverage Diff             @@
##             main     #618      +/-   ##
==========================================
+ Coverage   83.53%   83.54%   +0.01%
==========================================
  Files          55       55
  Lines        2696     2699       +3
==========================================
+ Hits         2252     2255       +3
  Misses        444      444
@gventuri Could you re-run the failing workflow? Judging from the log files, it seems the runner on GitHub's side failed to install the dependencies. Re-running the jobs could solve it.
@mspronesti done, merging :)
Hi @gventuri,
this PR aims at separating the cost evaluation of prompt tokens and completion/chat completion tokens, which are charged differently.
Summary by CodeRabbit
- Updated the get_openai_token_cost_for_model function to distinguish between prompt tokens and completion tokens, improving cost calculation accuracy.
- Updated the __call__ method to separately calculate costs for prompt and completion tokens, providing more granular control over token usage.
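To see why the split matters, here is a small worked comparison between pricing every token at a single rate (as the old total_tokens call did) and pricing prompt and completion tokens separately. The Usage class, the CostTracker class, and both rates are hypothetical stand-ins chosen for illustration; they are not the pandasai classes or confirmed OpenAI prices.

```python
from dataclasses import dataclass

# Hypothetical rates in USD per 1,000 tokens (assumptions, not real pricing).
PROMPT_RATE = 0.03
COMPLETION_RATE = 0.06


@dataclass
class Usage:
    """Hypothetical stand-in for the usage object of an API response."""

    prompt_tokens: int
    completion_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


class CostTracker:
    """Hypothetical, simplified analogue of a cost-tracking callback."""

    def __init__(self) -> None:
        self.total_cost = 0.0

    def __call__(self, usage: Usage) -> None:
        # Price each token type at its own rate, then accumulate.
        prompt_cost = usage.prompt_tokens * PROMPT_RATE / 1000
        completion_cost = usage.completion_tokens * COMPLETION_RATE / 1000
        self.total_cost += prompt_cost + completion_cost


usage = Usage(prompt_tokens=1200, completion_tokens=800)

# Old approach: every token priced at the prompt rate.
single_rate_cost = usage.total_tokens * PROMPT_RATE / 1000

# New approach: prompt and completion tokens priced separately.
tracker = CostTracker()
tracker(usage)
split_rate_cost = tracker.total_cost

print(f"single rate: ${single_rate_cost:.3f}, split rates: ${split_rate_cost:.3f}")
```

Under these assumed rates the single-rate estimate undercounts, since the 800 completion tokens are billed at twice the prompt rate.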