Loading a large KB while using the Elasticsearch Question Answerer with query_type {'embedder', 'embedder_text', 'embedder_keyword'} can be time-consuming if the process of obtaining embeddings is not batched or is not configured to use a GPU when available.
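For illustration, here is a minimal sketch of batched, GPU-aware embedding. `SentenceTransformer` is used as a stand-in for whatever embedder model the question answerer is configured with; the model name and the `kb_docs`/`texts` variables are illustrative assumptions, not from the codebase:

```python
# Minimal sketch of batched, GPU-aware embedding. SentenceTransformer is a
# stand-in for the configured embedder model; `kb_docs` and the model name
# are illustrative assumptions.
import torch
from sentence_transformers import SentenceTransformer

device = "cuda" if torch.cuda.is_available() else "cpu"
model = SentenceTransformer("all-MiniLM-L6-v2", device=device)

kb_docs = [{"id": "1", "answer": "Some KB answer text."}]  # placeholder KB
texts = [doc["answer"] for doc in kb_docs]

# One batched call instead of one call per document: the model encodes
# `batch_size` texts per forward pass, which is far faster, especially on GPU.
embeddings = model.encode(texts, batch_size=64, show_progress_bar=True)
```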
What can be modified in the codebase:
The `_doc_generator(data_file, embedder_model=None, embedding_fields=None)` method in the question_answerer.py file can first obtain the embeddings of all the docs in a single batched pass, dump the embeddings cache, and then use the `transform` method on each doc while creating the docs for Elasticsearch index creation, as in the sketch below.
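A rough sketch of that restructuring follows. `load_docs`, `embed_batch`, `dump_cache`, and the free-standing `transform` are hypothetical stand-ins for the existing loading, embedding, cache-dumping, and per-doc transform logic in question_answerer.py; the real signatures may differ:

```python
# Sketch of the restructured generator: embed everything up front in one
# batched pass, persist the cache once, then attach vectors per doc while
# yielding. `load_docs`, `embed_batch`, `dump_cache`, and `transform` stand
# in for existing logic in question_answerer.py; real signatures may differ.
def _doc_generator(data_file, embedder_model=None, embedding_fields=None):
    docs = list(load_docs(data_file))  # hypothetical: parse the KB file

    if embedder_model and embedding_fields:
        for field in embedding_fields:
            field_texts = [doc.get(field, "") for doc in docs]
            # Batched embedding per field rather than per document.
            vectors = embed_batch(embedder_model, field_texts)  # hypothetical
            for doc, vec in zip(docs, vectors):
                doc[field + "_embedding"] = list(vec)
        dump_cache(embedder_model)  # hypothetical: dump embeddings cache once

    for doc in docs:
        # Existing per-doc transform used while creating docs for the
        # Elasticsearch index (signature assumed).
        yield transform(doc)
```

Optional comments on memory optimization: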
Neither the solution suggested above nor the current implementation is optimized for memory: all the embeddings are held in RAM so that Elasticsearch can query the embedding of each KB doc. This may be worth looking into if we want loading of large KBs (on the order of >50K documents) to go smoothly; one disk-backed possibility is sketched below.
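One possible direction, sketched under the assumption that we control how embeddings are stored at load time: encode in chunks and write each chunk to a disk-backed numpy memmap, so the full embedding matrix never has to sit in RAM. `model` and `texts` reuse the names from the first sketch above:

```python
# Sketch: compute embeddings in chunks and write them to a disk-backed
# numpy memmap instead of holding the full matrix in memory. `model` and
# `texts` reuse the names from the first sketch; file name is arbitrary.
import numpy as np

dim = model.get_sentence_embedding_dimension()
store = np.memmap("kb_embeddings.dat", dtype="float32",
                  mode="w+", shape=(len(texts), dim))

chunk = 1024
for start in range(0, len(texts), chunk):
    batch = texts[start:start + chunk]
    store[start:start + len(batch)] = model.encode(batch, batch_size=64)
store.flush()  # rows can now be read back lazily instead of held in memory
```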