forked from langchain-ai/langchain
Commit
community[minor]: Add ITREX optimized Embeddings (langchain-ai#18474)
Introduction

[Intel® Extension for Transformers](https://github.com/intel/intel-extension-for-transformers) is an innovative toolkit designed to accelerate GenAI/LLM everywhere with the optimal performance of Transformer-based models on various Intel platforms.

Description

- Adds ITREX runtime embeddings using intel-extension-for-transformers.
- Adds mdx documentation and example notebooks.
- Adds embedding import testing.

---------

Signed-off-by: yuwenzho <[email protected]>
Co-authored-by: Bagatur <[email protected]>
Showing 7 changed files with 365 additions and 26 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
# Intel | ||
|
||
>[Optimum Intel](https://github.com/huggingface/optimum-intel?tab=readme-ov-file#optimum-intel) is the interface between the 🤗 Transformers and Diffusers libraries and the different tools and libraries provided by Intel to accelerate end-to-end pipelines on Intel architectures. | ||
>[Intel® Extension for Transformers](https://github.com/intel/intel-extension-for-transformers?tab=readme-ov-file#intel-extension-for-transformers) (ITREX) is an innovative toolkit designed to accelerate GenAI/LLM everywhere with the optimal performance of Transformer-based models on various Intel platforms, including Intel Gaudi2, Intel CPU, and Intel GPU. | ||
This page covers how to use optimum-intel and ITREX with LangChain. | ||
|
||
## Optimum-intel | ||
|
||
All functionality related to [optimum-intel](https://github.com/huggingface/optimum-intel.git) and [IPEX](https://github.com/intel/intel-extension-for-pytorch). | ||
|
||
### Installation | ||
|
||
Install optimum-intel and IPEX using: | ||
|
||
```bash | ||
pip install optimum[neural-compressor] | ||
pip install intel_extension_for_pytorch | ||
``` | ||
|
||
Please follow the installation instructions as specified below: | ||
|
||
* Install optimum-intel as shown [here](https://github.com/huggingface/optimum-intel). | ||
* Install IPEX as shown [here](https://intel.github.io/intel-extension-for-pytorch/index.html#installation?platform=cpu&version=v2.2.0%2Bcpu). | ||
|
||
### Embedding Models | ||
|
||
See a [usage example](/docs/integrations/text_embedding/optimum_intel). | ||
We also offer a full tutorial notebook, "rag_with_quantized_embeddings.ipynb", in the cookbook directory that uses the embedder in a RAG pipeline. | ||
|
||
```python | ||
from langchain_community.embeddings import QuantizedBiEncoderEmbeddings | ||
``` | ||
|
||
## Intel® Extension for Transformers (ITREX) | ||
|
||
All functionality related to [intel-extension-for-transformers](https://github.com/intel/intel-extension-for-transformers). | ||
|
||
### Installation | ||
|
||
Install intel-extension-for-transformers. For system requirements and other installation tips, please refer to the [Installation Guide](https://github.com/intel/intel-extension-for-transformers/blob/main/docs/installation.md). | ||
|
||
```bash | ||
pip install intel-extension-for-transformers | ||
``` | ||
|
||
Install other required packages. | ||
|
||
```bash | ||
pip install -U torch onnx accelerate datasets | ||
``` | ||
|
||
### Embedding Models | ||
|
||
See a [usage example](/docs/integrations/text_embedding/itrex). | ||
|
||
```python | ||
from langchain_community.embeddings import QuantizedBgeEmbeddings | ||
``` |
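The installation steps in the page above can be sanity-checked before importing either embedding class. The following is a minimal sketch, not part of the commit: the helper name `missing_modules` is hypothetical, and the top-level import names are assumptions derived from the pip package names, so they may differ between releases.

```python
import importlib.util
from typing import Iterable, List

# Assumed top-level import names for the pip packages listed in the page.
# They are not stated in the source and are used here for illustration only.
OPTIMUM_INTEL_MODULES = ["optimum", "intel_extension_for_pytorch"]
ITREX_MODULES = [
    "intel_extension_for_transformers",
    "torch",
    "onnx",
    "accelerate",
    "datasets",
]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    groups = (
        ("optimum-intel", OPTIMUM_INTEL_MODULES),
        ("ITREX", ITREX_MODULES),
    )
    for label, modules in groups:
        missing = missing_modules(modules)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{label}: {status}")
```

`find_spec` only locates a module without importing it, so the check stays cheap even when the heavy packages are installed.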
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Intel® Extension for Transformers Quantized Text Embeddings\n", | ||
"\n", | ||
"Load quantized BGE embedding models generated by [Intel® Extension for Transformers](https://github.com/intel/intel-extension-for-transformers) (ITREX) and use ITREX [Neural Engine](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/deprecated/docs/Installation.md), a high-performance NLP backend, to accelerate the inference of models without compromising accuracy.\n", | ||
"\n", | ||
"Refer to our blog post [Efficient Natural Language Embedding Models with Intel Extension for Transformers](https://medium.com/intel-analytics-software/efficient-natural-language-embedding-models-with-intel-extension-for-transformers-2b6fcd0f8f34) and the [BGE optimization example](https://github.com/intel/intel-extension-for-transformers/tree/main/examples/huggingface/pytorch/text-embedding/deployment/mteb/bge) for more details." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stderr", | ||
"output_type": "stream", | ||
"text": [ | ||
"/home/yuwenzho/.conda/envs/bge/lib/python3.9/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", | ||
" from .autonotebook import tqdm as notebook_tqdm\n", | ||
"2024-03-04 10:17:17 [INFO] Start to extarct onnx model ops...\n", | ||
"2024-03-04 10:17:17 [INFO] Extract onnxruntime model done...\n", | ||
"2024-03-04 10:17:17 [INFO] Start to implement Sub-Graph matching and replacing...\n", | ||
"2024-03-04 10:17:18 [INFO] Sub-Graph match and replace done...\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"from langchain_community.embeddings import QuantizedBgeEmbeddings\n", | ||
"\n", | ||
"model_name = \"Intel/bge-small-en-v1.5-sts-int8-static-inc\"\n", | ||
"encode_kwargs = {\"normalize_embeddings\": True} # set True to compute cosine similarity\n", | ||
"\n", | ||
"model = QuantizedBgeEmbeddings(\n", | ||
" model_name=model_name,\n", | ||
" encode_kwargs=encode_kwargs,\n", | ||
" query_instruction=\"Represent this sentence for searching relevant passages: \",\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Usage" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"text = \"This is a test document.\"\n", | ||
"query_result = model.embed_query(text)\n", | ||
"doc_result = model.embed_documents([text])" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "yuwen", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.9.0" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
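The notebook sets `normalize_embeddings` to `True` "to compute cosine similarity". A short sketch of why that works: once two vectors are scaled to unit length, their plain dot product equals their cosine similarity. The vectors below are made-up stand-ins for the lists returned by `embed_query` and `embed_documents`, and the helper names are hypothetical.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity computed from unnormalized vectors."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    return dot(a, b) / (norm_a * norm_b)


# Stand-in vectors; real embeddings have many more dimensions.
query_vec = [3.0, 4.0]
doc_vec = [4.0, 3.0]

# With normalized vectors, the dot product is already the cosine similarity.
score = dot(l2_normalize(query_vec), l2_normalize(doc_vec))
```

With normalized embeddings, a downstream similarity search can therefore use the dot product directly and skip the division by the norms.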