
Support for GPU usage (w/ batching) when loading a KB using ElasticsearchQA #373

Open
murali1996 opened this issue Oct 5, 2021 · 0 comments

@murali1996 (Contributor)

Loading a large KB with the Elasticsearch Question Answerer and a query_type of `embedder`, `embedder_text`, or `embedder_keyword` can be time-consuming if the process of obtaining embeddings is not batched and is not configured to use a GPU when one is available.

What can be modified in the codebase:
The method `_doc_generator(data_file, embedder_model=None, embedding_fields=None)` in `question_answerer.py` could first obtain the embeddings of all docs in batches, dump the embeddings cache, and then use the transform method on each doc while creating the docs for Elasticsearch index creation, as sketched below.
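A minimal sketch of that flow, assuming a JSON-lines KB file, a batched encoder method (`encode_batch`) and a cache-dump method (`dump_cache`) on the embedder; both method names are hypothetical stand-ins for the actual MindMeld embedder API:

```python
import json

BATCH_SIZE = 128  # tune to the GPU's memory


def _doc_generator(data_file, embedder_model=None, embedding_fields=None):
    with open(data_file) as fp:
        docs = [json.loads(line) for line in fp]

    if embedder_model is not None and embedding_fields:
        for field in embedding_fields:
            texts = [doc.get(field, "") for doc in docs]
            vectors = []
            # Batched encoding keeps a GPU-backed model saturated instead
            # of embedding one document at a time.
            for start in range(0, len(texts), BATCH_SIZE):
                vectors.extend(embedder_model.encode_batch(texts[start:start + BATCH_SIZE]))
            for doc, vector in zip(docs, vectors):
                doc[field + "_embedding"] = vector
        # Persist the embeddings cache once, after all fields are encoded.
        embedder_model.dump_cache()

    for doc in docs:
        yield doc
```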

Optional comments on memory optimization:
Neither the solution suggested above nor the current implementation is optimized for memory: all embeddings are held in RAM so that Elasticsearch can be given the embedding of each KB doc. This may be worth revisiting if we want to load large KBs (say, on the order of >50K documents) smoothly; one possible direction is sketched below.
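One rough, hypothetical way to bound memory would be to stream the KB file and encode one batch at a time, so only `BATCH_SIZE` embeddings live in RAM at once (`encode_batch` is again an assumed batched-encoder method, not the actual API; this trades the single up-front cache dump for streaming):

```python
import itertools
import json

BATCH_SIZE = 128


def _doc_generator_streaming(data_file, embedder_model, embedding_field):
    with open(data_file) as fp:
        while True:
            # Read the next BATCH_SIZE lines of the JSON-lines KB file.
            lines = list(itertools.islice(fp, BATCH_SIZE))
            if not lines:
                break
            batch = [json.loads(line) for line in lines]
            vectors = embedder_model.encode_batch(
                [doc.get(embedding_field, "") for doc in batch]
            )
            # Yield docs as soon as their batch is embedded, so the full
            # embedding matrix never has to be held in memory.
            for doc, vector in zip(batch, vectors):
                doc[embedding_field + "_embedding"] = vector
                yield doc
```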
