bugfix when initializing with async aoss vector store #17340

Aydin-ab · 2024-12-20T20:52:33Z

Description

@logan-markewich
Following up this issue [Bug]: Can't initialize OpensearchVectorClient through AWS endpoint in async mode #16746
16746

At the moment, we can't initialize a OpensSearchVectorClient using AOSS (amazon opensearch serverless) and/or an async connection. There are 2 problems here:

Problem 1: If initializing with AOSS (async or not), the code will fail when fetching the version number because this requires a GET / call to the endpoint. This works on standard Opensearch but OpenSearch Serverless doesn't support that command.
When using an async connection, the code will fail when getting or creating/refreshing the indices in the __init__(). This is because the code doesn't use the async client but only the "sync" one, this raises a TypeError due to a coroutine appearing somewhere.

Fixes # (issue)

Problem 1: The version number is used only to check the eligibility of the efficient-kNN filtering. I couldn't find any other ways to check the version of the cluster so I decided to remove that efficient-kNN filtering feature completely when we're using AOSS (we check self.aoss). Ideally, we'd like to find a workaround to check the version so we can still use efficient-kNN filtering while using AOSS.
Problem 2: I fix that with an extra except clause that catches a TypeError and retry with the async client this time.

New Package?

Had to import asyncio to run the async client but it's a python built-in anyway

Version Bump?

Yes
No

Type of Change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Not sure how to test that, we'd need to simulate an endpoint ?

I believe this change is already covered by existing unit tests

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added Google Colab support for the newly added notebooks.
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I ran make format; make lint to appease the lint gods

...tor_stores/llama-index-vector-stores-opensearch/llama_index/vector_stores/opensearch/base.py

logan-markewich · 2024-12-20T20:55:59Z

...tor_stores/llama-index-vector-stores-opensearch/llama_index/vector_stores/opensearch/base.py

@@ -123,14 +123,18 @@ def __init__(
        self._os_async_client = os_async_client or self._get_async_opensearch_client(


just curious, there's no way to create an sync client from an async client, or vice-versa? This would solve a lot of headaches if possible

I guess you could if you dig into the kwargs and change it accordingly for each sync/async instantiation

from opensearchpy import AsyncHttpConnection, AWSV4SignerAsyncAuth, RequestsHttpConnection, AWSV4SignerAuth # Sync parameters kwargs['auth'] = AWSV4SignerAuth(credentials, region, 'aoss') kwargs['connection_class'] = RequestsHttpConnection self._os_client = os_client or self._get_opensearch_client( self._endpoint, **kwargs ) # Async parameters kwargs['auth'] = AWSV4SignerAsyncAuth(credentials, region, 'aoss') kwargs['connection_class'] = AsyncHttpConnection self._os_client = os_client or self._get_opensearch_client( self._endpoint, **kwargs )

But you'd need the AWS credentials/region/service ('aoss') as additional inputs, seems too much

ah yea, thats pretty messy. Too bad

logan-markewich · 2024-12-20T21:27:02Z

Looks like you'll need to run make lint

…Aydin-ab/llama_index into bugfix-async-aoss-vector-stores

Aydin-ab · 2024-12-20T21:41:05Z

Looks like you'll need to run make lint

Sorry I don't know how that works :(

bugfix when initializing with async aoss

ff4215c

dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Dec 20, 2024

Merge branch 'main' into bugfix-async-aoss-vector-stores

841177e

logan-markewich reviewed Dec 20, 2024

View reviewed changes

...tor_stores/llama-index-vector-stores-opensearch/llama_index/vector_stores/opensearch/base.py Outdated Show resolved Hide resolved

logan-markewich reviewed Dec 20, 2024

View reviewed changes

Aydin-ab added 2 commits December 20, 2024 21:38

fix self.os_version error

6be8050

Merge branch 'bugfix-async-aoss-vector-stores' of https://github.com/…

1c6f818

…Aydin-ab/llama_index into bugfix-async-aoss-vector-stores

lint

48add0f

dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Dec 20, 2024

logan-markewich added 2 commits December 20, 2024 16:14

vbump

c380264

use asyncio_run

ada5dec

logan-markewich merged commit 718e1f5 into run-llama:main Dec 23, 2024
10 of 11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bugfix when initializing with async aoss vector store #17340

bugfix when initializing with async aoss vector store #17340

Aydin-ab commented Dec 20, 2024 •

edited

Loading

logan-markewich Dec 20, 2024

Aydin-ab Dec 20, 2024

logan-markewich Dec 20, 2024

logan-markewich commented Dec 20, 2024

Aydin-ab commented Dec 20, 2024

		@@ -123,14 +123,18 @@ def __init__(
		self._os_async_client = os_async_client or self._get_async_opensearch_client(

bugfix when initializing with async aoss vector store #17340

bugfix when initializing with async aoss vector store #17340

Conversation

Aydin-ab commented Dec 20, 2024 • edited Loading

Description

New Package?

Version Bump?

Type of Change

How Has This Been Tested?

Suggested Checklist:

logan-markewich Dec 20, 2024

Choose a reason for hiding this comment

Aydin-ab Dec 20, 2024

Choose a reason for hiding this comment

logan-markewich Dec 20, 2024

Choose a reason for hiding this comment

logan-markewich commented Dec 20, 2024

Aydin-ab commented Dec 20, 2024

Aydin-ab commented Dec 20, 2024 •

edited

Loading