Commit
Signed-off-by: greg pereira <[email protected]>
milvus/seed/README.md:

RAG application with ILAB

1. set up a vector DB (Milvus)
Development story:

0. Starting goal:
   - Naive RAG, no KG aid
   - Additions:
1. Identify what the model lacks knowledge in
2. Can I use the internally trained model, or do I have to use the HF model?

- UI integration

-----------------------------------------------
variable definition:
- class Config
- _identify_params
- _llm_type, _extract_token_usage

Inherent in defining this spec, which could eventually live as a contribution to langchain, are some assumptions / questions I made:

- Is the model serializable: Assumed no
- Max tokens for merlinite and granite: Both assumed 4096
- Does this model have attention / memory?
- Do these models have a verbosity option for output?
- Recommended default values:
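The method names in the notes above mirror LangChain's custom-LLM interface (where the property is actually spelled `_identifying_params`). A minimal stand-in sketch of that shape, folding in the assumptions from the notes — it deliberately does not inherit from langchain so it stays self-contained, and the class name and defaults are hypothetical:

```python
# Hypothetical sketch of the wrapper shape described in the notes above;
# not the actual contribution, and not a real langchain subclass.
from typing import Any, Dict, List


class MerliniteLLM:
    """Minimal stand-in for a LangChain-style LLM wrapper."""

    class Config:
        # assumption from the notes: the model is not serializable
        arbitrary_types_allowed = True

    def __init__(self, model_name: str = "ibm/merlinite-7b", max_tokens: int = 4096):
        # assumption from the notes: merlinite and granite both cap at 4096 tokens
        self.model_name = model_name
        self.max_tokens = max_tokens

    @property
    def _llm_type(self) -> str:
        # short tag identifying the backend type
        return "merlinite"

    @property
    def _identifying_params(self) -> Dict[str, Any]:
        # parameters that identify this model instance in logs and caches
        return {"model_name": self.model_name, "max_tokens": self.max_tokens}

    @staticmethod
    def _extract_token_usage(responses: List[Dict[str, Any]]) -> Dict[str, int]:
        # sum token counts from OpenAI-style "usage" blocks across responses
        usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
        for r in responses:
            for key in usage:
                usage[key] += r.get("usage", {}).get(key, 0)
        return usage
```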
-----------------------------------------------

import json
import os

import requests
from dotenv import load_dotenv

load_dotenv()

# manage ENV (os.getenv returns None when a variable is unset, not "",
# so pass the fallback as the second argument)
model_endpoint = os.getenv("MODEL_ENDPOINT", "http://localhost:8001")
model_name = os.getenv("MODEL_NAME", "ibm/merlinite-7b")
model_token = os.getenv("MODEL_TOKEN")

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {model_token}",
}

data = {
    "model": model_name,
    "messages": [
        {"role": "system", "content": "your name is carl"},
        {"role": "user", "content": "what is your name?"},
    ],
    "temperature": 1,
    "max_tokens": 1792,
    "top_p": 1,
    "repetition_penalty": 1.05,
    "stop": ["<|endoftext|>"],
    "logprobs": False,
    "stream": False,
}

# verify=False disables TLS verification; acceptable against a local dev
# endpoint, not in production
response = requests.post(model_endpoint, headers=headers, data=json.dumps(data), verify=False)
print(response.json())
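The script prints the raw JSON body. Assuming the endpoint returns the OpenAI-style chat-completions shape (`choices[0].message.content`), a small helper — hypothetical, not part of the original script — shows how the assistant reply could be pulled out; the `sample` body below is fabricated for illustration:

```python
# Hypothetical helper to extract the assistant message from an
# OpenAI-style /chat/completions response body.
from typing import Any, Dict


def extract_reply(body: Dict[str, Any]) -> str:
    # choices[0].message.content holds the assistant text in the
    # OpenAI-compatible schema; fail loudly if the shape is unexpected
    choices = body.get("choices")
    if not choices:
        raise ValueError(f"no choices in response: {body}")
    return choices[0]["message"]["content"]


# fabricated example response body
sample = {
    "choices": [{"message": {"role": "assistant", "content": "My name is Carl."}}],
    "usage": {"prompt_tokens": 18, "completion_tokens": 6, "total_tokens": 24},
}
print(extract_reply(sample))  # -> My name is Carl.
```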