-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LanceDB: removed mode from the add statement since mode should be append #13892
base: main
Are you sure you want to change the base?
LanceDB: removed mode from the add statement since mode should be append #13892
Conversation
…(which is the default value) and create is not an option for add, but is only a mode option for create_table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hragbalian -- makes sense to me. Added a small comment in my review.
@@ -348,7 +348,7 @@ def add( | |||
self._table_name, data, mode=self.mode | |||
) | |||
|
|||
self._table.add(data, mode=self.mode) | |||
self._table.add(data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps we can just add **add_kwargs here to handle the designated mode parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nerdai good point. I hadn't seen that there. I updated the PR with that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this test is failing still @hragbalian:
llama-index-integrations/vector_stores/llama-index-vector-stores-lancedb/tests/test_vector_stores_lancedb.py . [ 20%]
...F [100%]
=================================== FAILURES ===================================
_________________________________ test_delete __________________________________
index = <llama_index.core.indices.vector_store.base.VectorStoreIndex object at 0x7f8f46a14b20>
def test_delete(index: VectorStoreIndex) -> None:
index.delete(doc_id="test-0")
> assert index.vector_store._table.count_rows() == 2
E assert 12 == 2
E + where 12 = <bound method LanceTable.count_rows of LanceTable(connection=LanceDBConnection(/tmp/lancedb), name="vectors")>()
E + where <bound method LanceTable.count_rows of LanceTable(connection=LanceDBConnection(/tmp/lancedb), name="vectors")> = LanceTable(connection=LanceDBConnection(/tmp/lancedb), name="vectors").count_rows
E + where LanceTable(connection=LanceDBConnection(/tmp/lancedb), name="vectors") = LanceDBVectorStore(stores_text=True, is_embedding_query=True, flat_metadata=True, uri='/tmp/lancedb', vector_column_na..._key='text', doc_id_key='doc_id', api_key=None, region=None, mode='overwrite', query_type='vector', overfetch_factor=1)._table
E + where LanceDBVectorStore(stores_text=True, is_embedding_query=True, flat_metadata=True, uri='/tmp/lancedb', vector_column_na..._key='text', doc_id_key='doc_id', api_key=None, region=None, mode='overwrite', query_type='vector', overfetch_factor=1) = <llama_index.core.indices.vector_store.base.VectorStoreIndex object at 0x7f8f46a14b20>.vector_store
llama-index-integrations/vector_stores/llama-index-vector-stores-lancedb/tests/test_vector_stores_lancedb.py:42: AssertionError
Do you think you can take a look? You can try to test locally with pytest
or if you have pants
installed then you can use make test
as well.
@nerdai the |
@hragbalian I just ran the tests locally with and without this PR. Without this PR, all of the tests pass, but when I checkout the code for this PR and try to run tests, Looks like |
@hragbalian if I modify vector_store.add(nodes=nodes, mode="overwrite") |
@nerdai ok, I’ll take a closer look later this week to troubleshoot |
Really small update to remove the
mode
option from the lancedbadd
statement since it was sharing the mode option withcreate_table
, and I think the wrapper was intending it mainly for create_table. Two issues with sharing the mode withadd
:The options for mode are different for
add
than they are forcreate_table
: for add, the valid values are "append" and "overwrite" whereas forcreate_table
they are "create" and "overwrite".There is an unintended behavior when the value "overwrite" is specified for mode and is shared by the two methods: it will always overwrite the table thereby not allow you to build up the vector database with new embeddings.
Removing the
mode
fromadd
reverts the value to it's default which is "append", which should be what most situations will warrant. Alternatively, there could be a separate parameter to control the mode for add, but it really doesn't seem necessary.