[Bug]: weaviate client has no collection attributes #13614

Closed
arsyad2281 opened this issue May 21, 2024 · 11 comments · May be fixed by #13719
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized

Comments

@arsyad2281

Bug Description

I am initializing WeaviateVectorStore based on the example given in this link

However, I am getting this error:
AttributeError: 'Client' object has no attribute 'collections'

Python Version - 3.11.9

Version of llama index and relevant packages
llama-index==0.10.37
llama-index-vector-stores-weaviate==1.0.0

Version

0.10.37

Steps to Reproduce

Please replace <your-username>, <your-password> and <your-index-name> accordingly.
WeaviateDB is locally hosted through docker with this image: semitechnologies/weaviate:1.23.9

from llama_index.vector_stores.weaviate import WeaviateVectorStore
import weaviate

weaviate_url = "http://localhost:8080"

resource_owner_config = weaviate.AuthClientPassword(
    username="<your-username>",
    password="<your-password>",
)

client = weaviate.Client(
    weaviate_url,
    auth_client_secret=resource_owner_config,
)

vector_store = WeaviateVectorStore(
    weaviate_client=client, index_name="<your-index-name>"
)

Relevant Logs/Tracebacks

AttributeError                            Traceback (most recent call last)
Cell In[22], line 1
----> 1 vector_store = WeaviateVectorStore(
      2     weaviate_client=client, index_name="Ccassistant"
      3 )

File ~/homebrew/Caskroom/miniconda/base/envs/test_env/lib/python3.11/site-packages/llama_index/vector_stores/weaviate/base.py:172, in WeaviateVectorStore.__init__(self, weaviate_client, class_prefix, index_name, text_key, auth_config, client_kwargs, url, **kwargs)
    167     raise ValueError(
    168         "Index name must start with a capital letter, e.g. 'LlamaIndex'"
    169     )
    171 # create default schema if does not exist
--> 172 if not class_schema_exists(self._client, index_name):
    173     create_default_schema(self._client, index_name)
    175 super().__init__(
    176     url=url,
    177     index_name=index_name,
   (...)
    180     client_kwargs=client_kwargs or {},
    181 )

File ~/homebrew/Caskroom/miniconda/base/envs/test_env/lib/python3.11/site-packages/llama_index/vector_stores/weaviate/utils.py:76, in class_schema_exists(client, class_name)
     74 """Check if class schema exists."""
     75 validate_client(client)
---> 76 return client.collections.exists(class_name)

AttributeError: 'Client' object has no attribute 'collections'
@arsyad2281 arsyad2281 added bug Something isn't working triage Issue needs to be triaged/prioritized labels May 21, 2024

dosubot bot commented May 21, 2024

The error AttributeError: 'Client' object has no attribute 'collections' occurs because the weaviate.Client object does not have a collections attribute. The correct attribute to use for schema-related operations is schema.

To resolve this issue, update the code to use the schema attribute instead of collections. Here is the corrected code snippet for initializing the WeaviateVectorStore and checking the class schema:

  1. Update the class_schema_exists function to use the schema attribute:
def class_schema_exists(client: weaviate.Client, class_name: str) -> bool:
    """Check if class schema exists."""
    validate_client(client)
    return client.schema.exists(class_name)
  2. Ensure that the delete_index method in the WeaviateVectorStore class uses the schema attribute:
def delete_index(self) -> None:
    """Delete the index associated with the client.

    Raises:
    - Exception: If the deletion fails, for some reason.
    """
    if not class_schema_exists(self._client, self.index_name):
        _logger.warning(
            f"Index '{self.index_name}' does not exist. No action taken."
        )
        return
    try:
        self._client.schema.delete_class(self.index_name)
        _logger.info(f"Successfully deleted index '{self.index_name}'.")
    except Exception as e:
        _logger.error(f"Failed to delete index '{self.index_name}': {e}")
        raise Exception(f"Failed to delete index '{self.index_name}': {e}")
  3. Ensure that your initialization code follows the correct structure:
from llama_index.vector_stores.weaviate import WeaviateVectorStore
import weaviate

weaviate_url = "http://localhost:8080"

resource_owner_config = weaviate.AuthClientPassword(
    username="<your-username>",
    password="<your-password>",
)

client = weaviate.Client(
    weaviate_url,
    auth_client_secret=resource_owner_config,
)

vector_store = WeaviateVectorStore(
    weaviate_client=client, index_name="<your-index-name>"
)

Additionally, ensure that you are using the correct versions of the dependencies:

  • llama-index-core version ^0.10.1
  • weaviate-client version ^4.5.7

You can install the required packages using:

pip install llama-index-vector-stores-weaviate weaviate-client

This should resolve the issue and allow you to use the WeaviateVectorStore as intended.

@arsyad2281
Author

Do note that client.collections is also used in this function - link, where collection is expected to be an object in L328:

def query(self, query: VectorStoreQuery, **kwargs: Any) -> VectorStoreQueryResult:
    """Query index for top k most similar nodes."""
    all_properties = get_all_properties(self._client, self.index_name)
    collection = self._client.collections.get(self.index_name)
    filters = None
    ...
    ...
    # execute query
    try:
        query_result = collection.query.hybrid(
            query=query.query_str,
            vector=vector,
            alpha=alpha,
            limit=limit,
            filters=filters,
            return_metadata=return_metatada,
            return_properties=all_properties,
            include_vector=True,
        )
    except weaviate.exceptions.WeaviateQueryError as e:
        raise ValueError(f"Invalid query, got errors: {e.message}")
    ...
    ...

@chrisk314

I'm also hitting this with dependencies: llama-index-core=0.10.38.post1, llama-index-vector-stores-weaviate=1.0.0, weaviate-client=4.6.3.
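
For anyone comparing environments against the version lists in this thread, here is a minimal sketch that prints the installed versions of the packages mentioned so far. It uses only the standard library's importlib.metadata. The helper name installed_versions and the PACKAGES tuple are hypothetical, chosen for illustration.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Distribution names taken from the version lists quoted in this thread.
PACKAGES = (
    "llama-index",
    "llama-index-core",
    "llama-index-vector-stores-weaviate",
    "weaviate-client",
)


def installed_versions(names: Iterable[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Return {distribution name: version string, or None if not installed}."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence instead of failing, so the report stays complete.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

Pasting the output into a report makes it easier to see which weaviate-client release is actually installed next to the LlamaIndex integration.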

@chrisk314

Looks like this is a user error, apologies. I was attempting to use a weaviate v3 client (weaviate.Client) where LlamaIndex is expecting a weaviate v4 client (weaviate.WeaviateClient). Using the below code to create a client for a local weaviate instance is working fine. I'd suggest this can be closed.

from llama_index.vector_stores.weaviate import WeaviateVectorStore
import weaviate

client = weaviate.connect_to_local()
vector_store = WeaviateVectorStore(weaviate_client=client)

@chrisk314

@arsyad2281 from your issue description, it looks like you made the same mistake I did. Change your code as per my comment above and you should be good.

@chrisk314

On closer inspection I realised that the WeaviateVectorStore.from_params method was creating a Weaviate V3 client, whereas the rest of the code expects a Weaviate V4 client. I made a PR which changes all the code to require and use Weaviate V4 clients. This makes the updated code consistent with the rest of the existing code and removes the V3/V4 mismatch errors. It also future-proofs the code by migrating to Weaviate V4 now.

@krisz094

I just ran into a similar issue, where I try to ingest modified files into my Weaviate database, so basically doing an upsert.

I think it's caused by the base.py code using mixed V3 and V4 code too.

The error I get:

Traceback (most recent call last):
  File "/foo/bar.py", line 56, in <module>
    nodes = pipeline.run(documents=documents)
  File "/home/krisz/.pyenv/versions/3.10.13/lib/python3.10/site-packages/llama_index/core/ingestion/pipeline.py", line 682, in run
    nodes_to_run = self._handle_upserts(
  File "/home/krisz/.pyenv/versions/3.10.13/lib/python3.10/site-packages/llama_index/core/ingestion/pipeline.py", line 612, in _handle_upserts
    self.vector_store.delete(ref_doc_id)
  File "/home/krisz/.pyenv/versions/3.10.13/lib/python3.10/site-packages/llama_index/vector_stores/weaviate/base.py", line 261, in delete
    self._client.query.get(self.index_name)
AttributeError: 'WeaviateClient' object has no attribute 'query'

The relevant part in base.py:

query = (
    self._client.query.get(self.index_name)
    .with_additional(["id"])
    .with_where(where_filter)
    .with_limit(10000)  # 10,000 is the max weaviate can fetch
)

The _client.query doesn't exist in V4 anymore, I think.
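
The two tracebacks in this thread suggest a rough way to tell the client generations apart before a client is handed to WeaviateVectorStore: the V3 Client has no collections attribute, and the V4 WeaviateClient has no query attribute. Below is a minimal sketch of a fail-fast check built on that observation alone. It is a heuristic, not how weaviate or LlamaIndex validate clients. The names client_generation, require_v4, FakeV3Client and FakeV4Client are hypothetical and exist only so the example runs without a Weaviate server.

```python
from typing import Any


def client_generation(client: Any) -> str:
    """Guess which Weaviate client API an object exposes.

    Heuristic based only on the tracebacks in this thread:
    the V3 Client lacks `collections`, the V4 WeaviateClient lacks `query`.
    """
    if hasattr(client, "collections"):
        return "v4"
    if hasattr(client, "query"):
        return "v3"
    return "unknown"


def require_v4(client: Any) -> None:
    """Raise a descriptive error early instead of a late AttributeError."""
    generation = client_generation(client)
    if generation != "v4":
        raise TypeError(
            "Expected a Weaviate V4 client (e.g. from weaviate.connect_to_local()), "
            f"got {type(client).__name__} (detected: {generation})."
        )


# Hypothetical stand-ins that mimic only the attribute each generation exposes.
class FakeV3Client:
    query = object()


class FakeV4Client:
    collections = object()


if __name__ == "__main__":
    require_v4(FakeV4Client())  # passes silently
    try:
        require_v4(FakeV3Client())
    except TypeError as exc:
        print(exc)
```

Calling a check like this right after constructing the client turns the mismatch into one clear message at the call site, rather than an AttributeError deep inside utils.py or base.py.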

@chrisk314

@krisz094 thanks for pointing that out. I've pushed a commit dc8f15e to #13719 which fixes the WeaviateVectorStore.delete method. Just tested manually and it's working now. I didn't test it out with additional delete_kwargs["filters"] yet.

Weaviate and LlamaIndex code bases are both very new to me; I've based my fix for the delete method off code that's in the query method and what I can piece together from the Weaviate code and docs. I'm pushed for time right now and it's a long weekend in the UK. I'll be able to look more in depth at the rest of the WeaviateVectorStore code on Tuesday and test further.

@chrisk314

Just following up on this... I've now manually tested all public methods of the WeaviateVectorStore with the updates in the PR and all looks good to me.

Any timeline for a review from any of the core contributors?

@brenkehoe
Contributor

#13365 should fix delete method @chrisk314

@chrisk314

@brenkehoe ok thanks for pointing that out. I've now merged the latest changes from main into #13719 taking the delete implementation from main.

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Sep 2, 2024
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 9, 2024
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Sep 9, 2024
4 participants