prompt_tokens_details not being populated correctly in response model #1252

Closed
hnalla opened this issue Dec 12, 2024 · 1 comment · Fixed by #1254
Labels
bug Something isn't working

Comments

@hnalla
Contributor

hnalla commented Dec 12, 2024

  • [x] This is actually a bug report.
  • [ ] I am not getting good LLM Results
  • [x] I have tried asking for help in the community on discord or discussions and have not received a response.
  • [x] I have tried searching the documentation and have not found an answer.

What Model are you using?

  • [x] gpt-3.5-turbo
  • [x] gpt-4-turbo
  • [x] gpt-4
  • [ ] Other (please specify)

Describe the bug
I'm unable to see how many tokens were cached when I use client.chat.completions.create_with_completion: completion.usage.prompt_token_details always comes back with audio_tokens and cached_tokens set to 0.

To Reproduce

import os
from time import sleep

import instructor
from openai import OpenAI
from pydantic import BaseModel


class FakeClass(BaseModel):
    insight: str


# Assumes OPENAI_API_KEY is set in the environment
client = instructor.from_openai(OpenAI(api_key=os.environ["OPENAI_API_KEY"]))
# Repeat the string so the system prompt is long enough for OpenAI prompt caching to kick in
fake_str = "This is a Test string" * 1000

for x in range(0, 3):
    sleep(1)
    insight, completion = client.chat.completions.create_with_completion(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": fake_str,
            },
            {"role": "user", "content": "Test User Prompt"},
        ],
        max_retries=1,
        response_model=FakeClass,
    )
 
    # Always prints {'audio_tokens': 0, 'cached_tokens': 0}, even when tokens were cached
    print(completion.usage.prompt_token_details.model_dump())

I get this:

{'audio_tokens': 0, 'cached_tokens': 0}

Expected behavior
I expect to see how many tokens have been cached. I know OpenAI is caching the prompt, because the same request reports cached tokens when I call the OpenAI client directly, without wrapping it in Instructor:

open_ai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

for x in range(0, 3):
    sleep(1)
    chat_completion = open_ai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": fake_str,
            },
            {"role": "user", "content": "Test User Prompt"},
        ],
    )

    cached_tokens = chat_completion.usage.prompt_tokens_details.cached_tokens
    if cached_tokens == 0:
        print("No cached tokens")
    else:
        print(f"Tokens used: {cached_tokens}")
        print(chat_completion.usage.prompt_tokens_details.model_dump())

I get this output:

Tokens used: 4864
{'audio_tokens': 0, 'cached_tokens': 4864}


@github-actions github-actions bot added the bug Something isn't working label Dec 12, 2024
@hnalla
Contributor Author

hnalla commented Dec 12, 2024

This is happening due to a typo in this line:

prompt_token_details = PromptTokensDetails(audio_tokens=0, cached_tokens=0)

It's currently:

total_usage = CompletionUsage(
    completion_tokens=0, prompt_tokens=0, total_tokens=0,
    completion_tokens_details=CompletionTokensDetails(audio_tokens=0, reasoning_tokens=0),
    prompt_token_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0),
)

It's supposed to be:

total_usage = CompletionUsage(
    completion_tokens=0, prompt_tokens=0, total_tokens=0,
    completion_tokens_details=CompletionTokensDetails(audio_tokens=0, reasoning_tokens=0),
    prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0),
)

Notice the typo: prompt_token_details vs prompt_tokens_details.
I have tested this change locally and it fixes the issue.

Supporting docs from OpenAI: https://github.com/openai/openai-python/blob/6e1161bc3ed20eef070063ddd5ac52fd9a531e88/src/openai/types/completion_usage.py#L53
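
For context, here is a minimal sketch of why the misspelled keyword goes unnoticed. The Usage model below is a hypothetical stand-in, not the real openai CompletionUsage configuration: with Pydantic's default extra="ignore", an unknown keyword argument is silently dropped, so the correctly spelled prompt_tokens_details field just keeps its default and the cached-token counts never land on it.

from typing import Optional

from pydantic import BaseModel


class PromptTokensDetails(BaseModel):
    audio_tokens: int = 0
    cached_tokens: int = 0


class Usage(BaseModel):
    # Hypothetical stand-in for the usage model; Pydantic's default is extra="ignore"
    prompt_tokens: int
    prompt_tokens_details: Optional[PromptTokensDetails] = None


usage = Usage(
    prompt_tokens=100,
    prompt_token_details=PromptTokensDetails(cached_tokens=4864),  # typo: token vs tokens
)
print(usage.prompt_tokens_details)  # None -- the misspelled kwarg was silently ignored

The actual openai models may configure extra fields differently (for example, storing unknown inputs as extras instead of dropping them), but either way the correctly named field never receives the values, which matches the all-zero output in the report above.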

@hnalla hnalla changed the title from "prompt_tokens_details not being populated correctly in reponse model" to "prompt_tokens_details not being populated correctly in response model" Dec 12, 2024
hnalla added a commit to hnalla/instructor that referenced this issue Dec 12, 2024
@hnalla hnalla mentioned this issue Dec 12, 2024
devin-ai-integration bot added a commit that referenced this issue Dec 15, 2024
This fixes #1252 by properly preserving the prompt_tokens_details
information from the OpenAI response in the returned model.

- Added test to verify token caching behavior
- Modified process_response to preserve usage information

Link to Devin run: https://app.devin.ai/sessions/d34daab99304486baa9643600abeef15

Co-Authored-By: [email protected] <[email protected]>
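
For reference, a rough sketch of the kind of accumulation the fix makes possible. This is not instructor's actual process_response, and the helper name below is hypothetical; it assumes a running total_usage initialised with the corrected keyword as shown in the comment above, and uses the openai.types.completion_usage classes linked earlier.

from openai.types.completion_usage import CompletionUsage


def add_attempt_usage(total: CompletionUsage, attempt: CompletionUsage) -> None:
    """Fold one retry attempt's usage into the running total (hypothetical helper)."""
    total.completion_tokens += attempt.completion_tokens
    total.prompt_tokens += attempt.prompt_tokens
    total.total_tokens += attempt.total_tokens
    # With the corrected spelling, the cached-token count is carried over as well
    if attempt.prompt_tokens_details and total.prompt_tokens_details:
        total.prompt_tokens_details.cached_tokens = (
            (total.prompt_tokens_details.cached_tokens or 0)
            + (attempt.prompt_tokens_details.cached_tokens or 0)
        )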
@jxnl jxnl closed this as completed in f736fc1 Dec 16, 2024