Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bedrock] Implement prompt caching for supported models #4262

Closed
2 tasks
richardburleigh opened this issue Sep 9, 2024 · 19 comments
Closed
2 tasks

[Bedrock] Implement prompt caching for supported models #4262

richardburleigh opened this issue Sep 9, 2024 · 19 comments
Assignees
Labels
bedrock-runtime feature-request This issue requests a feature. service-api This issue is caused by the service API, not the SDK implementation.

Comments

@richardburleigh
Copy link

richardburleigh commented Sep 9, 2024

Describe the feature

Anthropic have introduced Prompt Caching for Claude which allows significantly faster inference. Adding support for this would be of significant value to customers.

Use Case

When using long system prompts or multi-agent applications, it can sometimes take up to 30 seconds or more for a response to be received by the customer. Prompt caching is an elegant and easy to implement means of reducing this significantly.

Proposed Solution

Add support for Prompt Caching.

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

SDK version used

Latest

Environment details (OS name and version, etc.)

All

@richardburleigh richardburleigh added feature-request This issue requests a feature. needs-triage This issue or PR still needs to be triaged. labels Sep 9, 2024
@tim-finnigan tim-finnigan self-assigned this Sep 9, 2024
@tim-finnigan
Copy link
Contributor

Thanks for the feature request. The Bedrock service team would need to implement support for this, as they own and maintain the underlying APIs (https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations.html).

Upon searching internally, I found that there was already an internal feature request for the Bedrock team to track this. I've added your +1 and use case to the tracking item, and if there's any additional info you want us to pass along to the Bedrock team please let us know. I'm going to close the issue since this won't be tracked on the SDK side. But any updates to service APIs like this would be released to the SDKs. Going forward you can refer to the blog and CHANGELOG for updates.

@tim-finnigan tim-finnigan added service-api This issue is caused by the service API, not the SDK implementation. and removed needs-triage This issue or PR still needs to be triaged. labels Sep 9, 2024
Copy link

github-actions bot commented Sep 9, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.

@juan-abia
Copy link

juan-abia commented Sep 19, 2024

Any news on this? It'd be awesome if we could track it

@sammcj
Copy link

sammcj commented Oct 19, 2024

Any updates on this? This would reduce Bedrock based LLM costs by up to 90%.

@devtanna
Copy link

Following up here and commenting for updates. This would indeed be helpful for us

@sammcj
Copy link

sammcj commented Oct 23, 2024

I'm meeting with a bunch of AWS SAs in the GenAI space tomorrow, I'm raising the lack of prompt caching as a detractor from Bedrock with clients and will reference this issue.

I realise this is really a bedrock issue, but given bedrock is closed source and doesn't have a public issue tracker I'm not sure where folks can track and discuss it.

@FarazAhmedSid
Copy link

Is there any tentative date for the release of this feature?

@sammcj
Copy link

sammcj commented Oct 24, 2024

Folks you're going to need to get onto your Amazon account reps / local SAs to push for this in Bedrock.

@mpbrenlla
Copy link

mpbrenlla commented Nov 5, 2024

Any updates on this?

@ajinkyaT
Copy link

ajinkyaT commented Nov 8, 2024

Waiting for updates!

@carl-krikorian
Copy link

This would be very helpful if implemented, any updates?

@skapoor-coatue
Copy link

Following

@ShivamSphn
Copy link

  • more caching time like openai

@JavierArredondo
Copy link

Following

@emorgan-korrobio
Copy link

also following

@sammcj
Copy link

sammcj commented Nov 21, 2024

Press the follow button and give it a +1 rather that writing "following" which spans everyone with an email.

@marcelslum
Copy link

following

@tomaszdudek7
Copy link

It has been released right now on re:Invent 2024. :)

@ranman
Copy link

ranman commented Dec 4, 2024

https://aws.amazon.com/blogs/aws/reduce-costs-and-latency-with-amazon-bedrock-intelligent-prompt-routing-and-prompt-caching-preview/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bedrock-runtime feature-request This issue requests a feature. service-api This issue is caused by the service API, not the SDK implementation.
Projects
None yet
Development

No branches or pull requests