enable support within aws_bedrockagent_knowledge_base for embedding_model_configuration and supplemental_data_storage_configuration #40737
…odel_configuration and supplemental_data_storage_configuration
* extend appropriate go schema
* add necessary go structs
* add acceptance tests
* extend docs as necessary
* fix broken dash characters found in docs
It's also worth noting that the build environment (make tools) was broken as of my pull of main on Christmas, and I had to remediate it by hand to be able to work at all. I documented my mitigation in the following post, but I'm not sure how we should handle that separate issue.
Welcome @awgibbs 👋
It looks like this is your first Pull Request submission to the Terraform AWS Provider! If you haven’t already done so please make sure you have checked out our CONTRIBUTOR guide and FAQ to make sure your contribution is adhering to best practice and has all the necessary elements in place for a successful approval.
Also take a look at our FAQ which details how we prioritize Pull Requests for inclusion.
Thanks again, and welcome to the community! 😃
@ewbankkit and other code owners -- Is there additional work wanted on this PR (to be done either by myself or others), or is it just waiting for maintainers to have cycles to act on it? I understand that you all have a lot on your plates. I just don't want this to be slowed down for want of something I should have done to make it pass the bar. I'm trying to support an urgent business need with this, and others are anxious to start using it.
@ewbankkit I realized after posting my last note that one check in the list had actually failed rather than being skipped. I just pushed a fresh commit that addresses the semgrep failure, which I first reproduced locally before accepting the automatic changes. I also executed a fully realistic run of the new fancyOpenSearch test, which passed.
Andrews-MBP:terraform-provider-aws awgibbs$ semgrep --config .ci/.semgrep.yml --config .ci/.semgrep-constants.yml --config .ci/.semgrep-test-constants.yml --config .ci/semgrep/ internal/service/bedrockagent/knowledge_base.go
Andrews-MBP:terraform-provider-aws awgibbs$ make testacc TESTS=TestAccBedrockAgent_serial/KnowledgeBase/fancyOpenSearch PKG=bedrockagent
@ewbankkit Looks like my last commit got the "checks" to a happy place. Is there anything else I can do to help ready this for merge and release?
@awgibbs - thanks for your effort on this! Given that the pre-existing acceptance tests requiring OpenSearch vector stores also do not automate the setup steps, I think we can proceed without requiring that here. That said, it would be great to have a formal writeup of the setup steps so that a maintainer can replicate them and run the full test suite once, at least for this initial implementation. If you can provide that (or link to AWS docs), we can embed the steps as comments above the test case(s) and add provider-specific notes as necessary. I'll also open a follow-up issue to investigate automating the setup for both OpenSearch-based acceptance tests so we don't lose track of that work stream.
Hey @jar-b -- Stoked to finish the swing on this! I think this is probably the best AWS documentation for our present purposes: https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-create.html
When I did my own testing I just added another Bedrock Knowledge Base (using my rev'd provider) on top of an existing OpenSearch Serverless Collection that had been created with Terraform. I imagine a more generalized manual approach would be to follow the docs above to create a BKB in the console, which, when using the "Quick create a new vector store" option, automates creating the underlying OSSC, and then to create an identical-ish BKB using my rev'd Terraform provider as in my "fancy OpenSearch" test. It's important to match the vector type and dimensions across the BKB and the OSSC, or the BKB creation step will throw an error. Maybe the easiest thing to do beyond that is to ensure that the "fancy OpenSearch" test plugs in whatever values come out of the "Quick create a new vector store" console workflow by default, to minimize any last-second massaging of the test suite (maybe plugging in the service role would be the only thing required if we did that?).
To the extent it's helpful, I'm happy to keep pulling on the oars myself to get this over the finish line. Maybe I should inline some docs in internal/service/bedrockagent/knowledge_base_test.go by adding a "Prerequisites" comment block ahead of testAccKnowledgeBaseConfig_fancyOpenSearch that captures the foregoing? Or I can just be close air support on whatever final touches you want to make yourself. Let me know how to help you help me. Thanks! :-)
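The requirement in the comment above, that the vector type and dimensions agree between the knowledge base and the OpenSearch index, could be checked before attempting creation. The following is a minimal sketch only: `EmbeddingSpec` and `checkCompatible` are hypothetical names for illustration and are not part of the provider or the AWS SDK.

```go
package main

import "fmt"

// EmbeddingSpec is a hypothetical summary of the two settings that must
// agree between the Bedrock knowledge base and the vector index.
type EmbeddingSpec struct {
	DataType   string // "FLOAT32" or "BINARY"
	Dimensions int    // e.g. 1024
}

// checkCompatible returns an error describing the first mismatch found,
// or nil when the knowledge base and index settings agree.
func checkCompatible(kb, index EmbeddingSpec) error {
	if kb.DataType != index.DataType {
		return fmt.Errorf("data type mismatch: knowledge base uses %s, index uses %s",
			kb.DataType, index.DataType)
	}
	if kb.Dimensions != index.Dimensions {
		return fmt.Errorf("dimension mismatch: knowledge base uses %d, index uses %d",
			kb.Dimensions, index.Dimensions)
	}
	return nil
}

func main() {
	kb := EmbeddingSpec{DataType: "BINARY", Dimensions: 1024}
	index := EmbeddingSpec{DataType: "FLOAT32", Dimensions: 1024}
	if err := checkCompatible(kb, index); err != nil {
		fmt.Println("would fail at creation time:", err)
	}
}
```

Running a check like this against the values that come out of the console workflow would surface a mismatch before the provider ever calls the API.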
… steps relatively painless
Description
Last month AWS introduced binary embedding support for Amazon Titan Text Embeddings V2. This PR makes it possible to choose that embedding data type as well as configure the dimensions. For good measure it also adds support for supplemental storage configuration.
I have created and successfully run a new acceptance test. I should note, however, that testing against OpenSearch Serverless Collections (OSSC, which is what I was personally targeting) is difficult in general. A polished approach to acceptance tests seems to have been deferred when this resource was created some months ago, which left me in a somewhat difficult and manually intensive situation, as there was no perfect/automated model to follow. You will note that the existing OSSC tests were set to "skip", and that is how I am committing my new one (though it was not skipped for my actual testing). To perform my testing I had the KB created by the acceptance test runs point at an appropriate existing, external OSSC. The relevant parameters are XXX'd out in the acceptance test. I have successfully tested against both the "BINARY" and "FLOAT32" index data types, backed by real OSSC instances created out-of-band.
In real-life, fully automated environments I have used the aws_lambda_invocation resource to execute post-creation OSSC manipulations to get an index in place. Bedrock KB creation fails without this underlying index because validation fails. I think it would be reasonable to continue to defer this area as tech debt, but to circle back soon to get these acceptance tests into a better place across the board (not just mine; I am happy to help). At the moment, however, I am in a rather urgent situation: the impetus for this work is the need to convert vector DBs in an operational environment to the "BINARY" data type.
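To make the index prerequisite concrete, here is a hedged sketch of an index-creation body that an out-of-band step (such as the Lambda mentioned above) might send to OpenSearch. The mapping keys (`knn_vector`, `data_type`, `space_type`, the `faiss`/`hnsw` method) reflect my understanding of the OpenSearch k-NN plugin and should be verified against its documentation; `indexBody` and the vector field name are hypothetical, and a real index would also need the text and metadata fields that the knowledge base is configured to use.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// indexBody builds a create-index request body for the given embedding
// data type ("FLOAT32" or "BINARY"). This is an assumption-laden sketch,
// not the mapping used by the acceptance tests in this PR.
func indexBody(vectorField string, dimensions int, dataType string) (map[string]interface{}, error) {
	method := map[string]interface{}{"name": "hnsw", "engine": "faiss"}
	vector := map[string]interface{}{
		"type":      "knn_vector",
		"dimension": dimensions,
		"method":    method,
	}

	switch dataType {
	case "FLOAT32":
		method["space_type"] = "l2"
	case "BINARY":
		// Assumed constraint: binary vectors are packed 8 bits per byte,
		// so the dimension must be a multiple of 8.
		if dimensions%8 != 0 {
			return nil, fmt.Errorf("binary vectors need a dimension divisible by 8, got %d", dimensions)
		}
		vector["data_type"] = "binary"
		method["space_type"] = "hamming"
	default:
		return nil, fmt.Errorf("unsupported embedding data type %q", dataType)
	}

	return map[string]interface{}{
		"settings": map[string]interface{}{
			"index": map[string]interface{}{"knn": true},
		},
		"mappings": map[string]interface{}{
			"properties": map[string]interface{}{vectorField: vector},
		},
	}, nil
}

func main() {
	// "example-vector-field" is a placeholder field name.
	body, err := indexBody("example-vector-field", 1024, "BINARY")
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(body, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Whatever creates the index, the data type and dimension used here have to match the values passed to the knowledge base, otherwise creation fails during validation as described above.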
Closes #40576.
References
https://aws.amazon.com/blogs/machine-learning/build-cost-effective-rag-applications-with-binary-embeddings-in-amazon-titan-text-embeddings-v2-amazon-opensearch-serverless-and-amazon-bedrock-knowledge-bases/
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent/client/create_knowledge_base.html
Output from Acceptance Testing