[Bedrock] Implement prompt caching for supported models #4262
Comments
Thanks for the feature request. The Bedrock service team would need to implement support for this, as they own and maintain the underlying APIs (https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations.html). Upon searching internally, I found that there was already an internal feature request for the Bedrock team to track this. I've added your +1 and use case to the tracking item, and if there's any additional info you want us to pass along to the Bedrock team, please let us know. I'm going to close the issue since this won't be tracked on the SDK side, but any updates to service APIs like this would be released to the SDKs. Going forward, you can refer to the blog and CHANGELOG for updates.
This issue is now closed. Comments on closed issues are hard for our team to see.
Any news on this? It'd be awesome if we could track it
Any updates on this? This would reduce Bedrock-based LLM costs by up to 90%.
Following up here and commenting for updates. This would indeed be helpful for us
I'm meeting with a bunch of AWS SAs in the GenAI space tomorrow. I'm raising the lack of prompt caching as a detractor from Bedrock with clients and will reference this issue. I realise this is really a Bedrock issue, but given Bedrock is closed source and doesn't have a public issue tracker, I'm not sure where folks can track and discuss it.
Is there any tentative date for the release of this feature?
Folks, you're going to need to get onto your Amazon account reps / local SAs to push for this in Bedrock.
Any updates on this?
Waiting for updates!
This would be very helpful if implemented, any updates?
Following
Following
also following
Press the follow button and give it a +1 rather than writing "following", which spams everyone subscribed with an email.
following
It has just been released at re:Invent 2024. :)
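For anyone who finds this later, usage on Bedrock goes through the Converse API, roughly as in the sketch below. This is a minimal sketch based on my reading of the docs, not official guidance: the model ID, region, and the placement of the cachePoint block are illustrative, and only certain models support caching, so check the Bedrock documentation for your model.

```python
# Sketch: Bedrock prompt caching via the Converse API (boto3).
# A cachePoint block marks everything before it in the list as cacheable.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

long_system_prompt = "...several thousand tokens of stable instructions..."

response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",  # illustrative model ID
    system=[
        {"text": long_system_prompt},
        {"cachePoint": {"type": "default"}},  # cache the prefix above this point
    ],
    messages=[
        {"role": "user", "content": [{"text": "What do the instructions say about refunds?"}]},
    ],
)

# Cache effectiveness is reported in the usage block; field names assumed from the docs.
usage = response["usage"]
print(usage.get("cacheWriteInputTokens"), usage.get("cacheReadInputTokens"))
```

On the first call the prefix is written to the cache (cacheWriteInputTokens); subsequent calls that reuse the same prefix should report cacheReadInputTokens instead, which is where the latency and cost savings come from.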
Describe the feature
Anthropic have introduced Prompt Caching for Claude, which allows significantly faster and cheaper inference when a long prompt prefix is reused across requests. Adding support for this would be of significant value to customers.
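For reference, on Anthropic's first-party Messages API the feature is enabled by tagging a content block with cache_control. A rough sketch follows; the model ID and the beta header reflect the feature's initial beta rollout and are assumptions here, not part of any Bedrock API.

```python
# Sketch: prompt caching on Anthropic's native Messages API.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # illustrative model ID
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "...long, stable system prompt...",
            "cache_control": {"type": "ephemeral"},  # cache up to this block
        },
    ],
    messages=[{"role": "user", "content": "Summarize the rules above."}],
    # Beta opt-in header from the initial rollout; may no longer be required.
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
)
```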
Use Case
When using long system prompts or multi-agent applications, it can sometimes take 30 seconds or more for the customer to receive a response. Prompt caching is an elegant, easy-to-implement means of reducing this latency significantly.
Proposed Solution
Add support for Prompt Caching.
Other Information
No response
Acknowledgements
SDK version used
Latest
Environment details (OS name and version, etc.)
All