🚀 FAISS to the rescue! If you're tired of high costs from services like Pinecone, especially for large-scale vector storage, running FAISS on your own AWS instance can save your budget. For example, hosting FAISS on a t3.medium EC2 instance can drastically cut costs compared to Pinecone's recurring fees for retrieval and upserts. With FAISS, you get scalable, high-performance vector search at a fraction of the price: no subscription costs, no hidden fees! Why pay more when you can build smarter? 😎
This project provides a Flask-based API for managing FAISS vector stores with OpenAI embeddings. You can create vector stores, add vectors to them, and perform similarity searches.
- Create FAISS vector stores: Easily initialize new FAISS vector stores with a unique token.
- Add vectors: Use OpenAI embeddings to generate vectors and add them to the store.
- Similarity search: Perform top-K searches to find the most similar vectors based on input text.
- JSON serializable responses: Ensure clean and valid JSON responses for all API endpoints.
- Memory management: Vector stores are loaded and released from memory efficiently.
Run the server:

```bash
python3 main.py
```
- Endpoint: `/create/<name>`
- Method: `POST`
- Description: Creates a new FAISS vector store and returns a unique token.

Example request:

```bash
curl -X POST http://localhost:5000/create/my_store
```

Response:

```json
{
  "token": "unique_token"
}
```
- Endpoint: `/vector-token/index`
- Method: `POST`
- Description: Adds vectors to the store using an auth token, with automatic metadata handling.

Example request:

```bash
curl -X POST http://localhost:5000/vector-token/index \
  -H "Content-Type: application/json" \
  -d '{"auth_token": "your_token", "texts": "your input text", "subject": "Test Subject"}'
```

Response:

```json
{
  "message": "Vector added successfully"
}
```
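The same request can be built in Python instead of curl; a hedged sketch (the URL and `your_token` are placeholders from the example above, not verified values):

```python
# Build the same JSON body as the curl example.
# "your_token" and the endpoint URL are placeholders, not real values.
import json

payload = {
    "auth_token": "your_token",
    "texts": "your input text",
    "subject": "Test Subject",
}
body = json.dumps(payload)
# To send it:
# requests.post("http://localhost:5000/vector-token/index",
#               headers={"Content-Type": "application/json"}, data=body)
print(body)
```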
- Endpoint: `/vector_token/search/<k>`
- Method: `POST`
- Description: Performs a top-K similarity search based on the input query context.

Example request:

```bash
curl -X POST http://localhost:5000/vector_token/search/5 \
  -H "Content-Type: application/json" \
  -d '{"auth_token": "your_token", "input_query_context": "search text"}'
```

Response:

```json
{
  "results": [
    {
      "vector_id": 1,
      "distance": 290884,
      "metadata": {
        "input_text": "I wana sleep 4",
        "added_on": "2024-09-28-22-17",
        "vector_id": 1
      }
    },
    {
      "vector_id": 0,
      "distance": 999999,
      "metadata": {
        "input_text": "dummy_vector",
        "added_on": "2024-09-28-22-17",
        "vector_id": 0
      }
    }
  ]
}
```
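Because distances come back as scaled integers, a client can divide by 1,000,000 to recover approximate float distances. A small sketch using trimmed-down sample data in the shape of the example response:

```python
# Undo the server's integer distance scaling (divide by 1,000,000).
# The sample data mirrors the example response shown above.
import json

raw = '{"results": [{"vector_id": 1, "distance": 290884}, {"vector_id": 0, "distance": 999999}]}'
response = json.loads(raw)

for hit in response["results"]:
    approx = hit["distance"] / 1_000_000   # e.g. 290884 -> 0.290884
    print(hit["vector_id"], approx)
```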
- OpenAI Embeddings: The API uses OpenAI's embedding model (`text-embedding-ada-002`) to generate embeddings from text.
- Vector Store Management: Each vector store is associated with a unique token. Stores are saved to disk and can be loaded into or released from memory when needed.
- Indexing and Searching: FAISS is used for efficient vector storage and similarity searching.
- Optimize Memory Handling: Automatically release vector stores from memory after every operation.
- Distance Scaling: Convert distances to integers by multiplying by `1,000,000` to avoid floating-point precision issues in JSON responses.
- Error Handling: Added error handling for invalid tokens and indices.
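The distance-scaling step above can be sketched as follows (assuming simple rounding; the app's exact conversion may differ):

```python
# Scale float distances to ints for clean JSON (assumed rounding behaviour).
SCALE = 1_000_000

def scale_distance(d: float) -> int:
    return round(d * SCALE)

print(scale_distance(0.290884))   # 290884, as in the example response
```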
- Clone the repository:

  ```bash
  git clone https://github.com/wolfmib/alinex-faiss.git
  ```

- Set up the environment:

  - Install Python dependencies:

    ```bash
    pip install -r requirements.txt
    ```

  - Create a `.env` file and add your OpenAI API key.
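A minimal `.env` sketch (the variable name `OPENAI_API_KEY` is the usual convention for the OpenAI client; check the code for the exact name it reads):

```shell
# .env : placeholder value; use your real OpenAI API key
OPENAI_API_KEY=sk-your-key-here
```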
- Run tests:

  ```bash
  python -m unittest discover -s tests
  ```
This project is licensed under the MIT License. Please remember to attribute contributions where applicable.
- Replace placeholders like `your_token` with actual values.