Token exceeded for LibreChat but not when directly invoking the same model #4259
Replies: 3 comments 2 replies
-
Which model, via a custom endpoint? The system has a specific list of models and uses a default context window if it's not recognized. There will be a way to set the max context via config, but for now you can use the parameters for this.
-
@danny-avila I am using […]. Sharing the librechat yaml:
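For reference, a custom endpoint entry in librechat.yaml generally follows the shape below. This is only a sketch: the endpoint name, baseURL, environment variable, and model identifier are placeholders rather than the poster's actual values.

```yaml
# Illustrative librechat.yaml custom endpoint (placeholder values throughout).
version: 1.0.5            # schema version; use the one matching your LibreChat release
cache: true
endpoints:
  custom:
    - name: "my-endpoint"                # display name shown in the UI
      apiKey: "${MY_ENDPOINT_API_KEY}"   # resolved from the .env file
      baseURL: "https://example.com/v1"
      models:
        default: ["my-model-id"]         # model identifiers exposed by this endpoint
        fetch: false
      titleConvo: true
      titleModel: "my-model-id"
```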
-
That specific model should have its context recognized, so it's odd that it isn't for you; it is recognized here even when I use the exact model identifier. To understand what's going on, can you share your debug logs? If you are using docker, they are saved in the mounted logs directory. Reproduce the issue, then check the latest debug logs (files starting with "debug" in the name).
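As a minimal sketch of where those files end up, assuming the stock docker-compose.yml layout (the exact host and container paths may differ by deployment and LibreChat version):

```yaml
# Assumed volume mapping in docker-compose.yml; verify against your own compose file.
services:
  api:
    volumes:
      - ./logs:/app/api/logs   # debug-*.log files then appear under ./logs on the host
```

With a mapping like this, the newest file under ./logs whose name starts with "debug" is the one to share.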
-
What happened?
Token exceeded for LibreChat but not when directly invoking the model with the same prompt
I have found an issue: I get the following error with LibreChat, but when I directly invoke the same model via Streamlit + AWS Lambda I get a proper response.
{"level":"error","message":"[handleAbortError] AI response error; aborting request: Prompt token count of 37165 exceeds max token count of 4095.","stack":"Error: Prompt token count of 37165 exceeds max token count of 4095.\n at OpenAIClient.handleContextStrategy (/app/api/app/clients/BaseClient.js:374:13)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async OpenAIClient.buildMessages (/app/api/app/clients/OpenAIClient.js:559:61)
Steps to Reproduce
I am working on internal two-hour transcript data, so I can't share it here.
What browsers are you seeing the problem on?
No response
Relevant log output
Screenshots
No response
Code of Conduct