[Bedrock] Implement prompt caching for supported models #4262

richardburleigh · 2024-09-09T10:50:48Z

Describe the feature

Anthropic have introduced Prompt Caching for Claude which allows significantly faster inference. Adding support for this would be of significant value to customers.

Use Case

When using long system prompts or multi-agent applications, it can take 30 seconds or more for a response to reach the customer. Prompt caching is an elegant, easy-to-implement way to reduce this significantly.

Proposed Solution

Add support for Prompt Caching.

Other Information

No response

Acknowledgements

I may be able to implement this feature request
This feature might incur a breaking change

SDK version used

Latest

Environment details (OS name and version, etc.)

All

tim-finnigan · 2024-09-09T20:12:17Z

Thanks for the feature request. The Bedrock service team would need to implement support for this, as they own and maintain the underlying APIs (https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations.html).

Upon searching internally, I found that there was already an internal feature request for the Bedrock team to track this. I've added your +1 and use case to the tracking item, and if there's any additional info you want us to pass along to the Bedrock team please let us know. I'm going to close the issue since this won't be tracked on the SDK side. But any updates to service APIs like this would be released to the SDKs. Going forward you can refer to the blog and CHANGELOG for updates.

github-actions · 2024-09-09T20:12:33Z

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.

juan-abia · 2024-09-19T15:18:43Z

Any news on this? It'd be awesome if we could track it

sammcj · 2024-10-19T00:46:28Z

Any updates on this? This would reduce Bedrock-based LLM costs by up to 90%.

devtanna · 2024-10-23T09:44:02Z

Following up here and commenting for updates. This would indeed be helpful for us

sammcj · 2024-10-23T09:46:50Z

I'm meeting with a bunch of AWS SAs in the GenAI space tomorrow. I'll raise the lack of prompt caching as a detractor from Bedrock with clients and will reference this issue.

I realise this is really a Bedrock issue, but given that Bedrock is closed source and doesn't have a public issue tracker, I'm not sure where folks can track and discuss it.

FarazAhmedSid · 2024-10-24T10:52:01Z

Is there any tentative date for the release of this feature?

sammcj · 2024-10-24T10:59:53Z

Folks, you're going to need to get onto your Amazon account reps / local SAs to push for this in Bedrock.

mpbrenlla · 2024-11-05T08:56:43Z

Any updates on this?

ajinkyaT · 2024-11-08T05:40:42Z

Waiting for updates!

carl-krikorian · 2024-11-11T11:51:42Z

This would be very helpful if implemented, any updates?

skapoor-coatue · 2024-11-15T15:59:48Z

Following

ShivamSphn · 2024-11-18T21:21:38Z

More caching time, like OpenAI.

JavierArredondo · 2024-11-21T14:34:19Z

Following

emorgan-korrobio · 2024-11-21T18:46:16Z

also following

sammcj · 2024-11-21T19:26:14Z

Press the follow button and give it a +1 rather than writing "following", which spams everyone with an email.

marcelslum · 2024-11-22T00:46:29Z

following

tomaszdudek7 · 2024-12-04T17:21:43Z

It was just released at re:Invent 2024. :)

ranman · 2024-12-04T17:24:43Z

https://aws.amazon.com/blogs/aws/reduce-costs-and-latency-with-amazon-bedrock-intelligent-prompt-routing-and-prompt-caching-preview/

richardburleigh added the feature-request and needs-triage labels Sep 9, 2024

tim-finnigan self-assigned this Sep 9, 2024

tim-finnigan closed this as completed Sep 9, 2024

tim-finnigan added the service-api label and removed the needs-triage label Sep 9, 2024

tim-finnigan added the bedrock-runtime label Sep 9, 2024
