Adding chat history to RAG app and refactor to better utilize LangChain #648
base: main
Conversation
…eep track of and retrieve chat history from Cloud SQL.
* main.py — removed the old LangChain logic used to retrieve context and replaced it with the new chain from rag_chain.py. Introduced a browser session with a 30-minute TTL; the session ID is stored in the session cookie and used to retrieve chat history, which is cleared when the timeout is reached.
* cloud_sql.py — now includes a method to create a PostgresEngine for storing and retrieving history, plus a CustomVectorStore to perform the query embedding and vector search. Old code paths that were no longer needed were removed.
* rag_chain.py — contains the helper method create_chain to create, update, and delete the end-to-end RAG chain with history.
* various tf files — increased max input and total tokens on HF TGI for Mistral; threaded through some parameters needed to instantiate the PostgresEngine.
* requirements.txt — added some dependencies needed for LangChain.
/gcbrun
Reverted breaking change to env var
* Working on improvements for the RAG application:
  - Working on a missing TODO
  - Fixing an issue with credentials
  - Refactoring vector_storages so different vector storages can be added (TODO: vector storage factory; unit tests will be added in a future PR)
* Updating changes with the db
* Refactoring the app so it can be executed using gunicorn
* Refactoring the code as a Flask application package
* Fixing bugs:
  - Reviewing an issue with IPTypes; the current fix is to check for a development environment so a public Cloud SQL instance can be used
  - Fixing an issue with the Flask app factory
* Working on a custom HuggingFace interface:
  - Adding a custom chat model to send requests to the HuggingFace TGI API
  - Applying formatting to the code
* Improving the Cloud SQL vector_storage
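The "vector storage factory" flagged as a TODO in the commit list above can be sketched as a simple registry that maps a backend name to a constructor, so new vector storages plug in without touching call sites. All names here (`register`, `create_vector_storage`, `CloudSQLVectorStorage`) are hypothetical illustrations of the pattern, not the repository's code.

```python
from typing import Callable, Dict

_REGISTRY: Dict[str, Callable[..., object]] = {}

def register(name: str):
    """Class decorator: register a vector-storage backend under a name."""
    def wrap(cls):
        _REGISTRY[name] = cls
        return cls
    return wrap

def create_vector_storage(name: str, **kwargs):
    """Instantiate a registered backend, failing loudly on unknown names."""
    try:
        return _REGISTRY[name](**kwargs)
    except KeyError:
        raise ValueError(f"unknown vector storage: {name}") from None

@register("cloud_sql")
class CloudSQLVectorStorage:
    def __init__(self, instance: str = "", ip_type: str = "PRIVATE"):
        # Mirrors the IPTypes workaround mentioned above: a development
        # environment would pass ip_type="PUBLIC" for a public instance.
        self.instance = instance
        self.ip_type = ip_type
```

Adding another backend is then a new decorated class, with no changes to the code that calls `create_vector_storage`.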
…atform/ai-on-gke into rag-langchain-chat-history
@german-grandas, in the snapshot of "Previous RAG without Chat History", I saw some errors were thrown. I tried with the latest code on the main branch (i.e. the previous RAG without chat history) and didn't see any errors. Do you know what was going wrong? Below is my snapshot:
@gongmax In the example I'm deploying with the image. Which image did you use in the test you mentioned?
I didn't make any changes; I just pulled the main branch and deployed the whole application. The image is the same as the one you mentioned above.
It's odd. Reviewing the logs, I see a warning from the database, and that's what the frontend is showing as part of the prompt response. I'm not sure how to track this down, any ideas?
Comments about improving the quality of the answer generation: currently using the model
This is the answer for the previous prompt. In a further exercise, a more robust model could be used. It might be worth exploring a migration to Vertex AI instead of continuing to use open-source models from Hugging Face like Mistral.
Can you adjust the length of the chat history included in the context and see how it impacts the response?
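The adjustment suggested above amounts to a windowing helper that caps how many past turns go into the prompt. `trim_history` and its signature are hypothetical, shown only to illustrate the knob being discussed, not code from this PR.

```python
def trim_history(
    messages: list[tuple[str, str]], max_turns: int
) -> list[tuple[str, str]]:
    """Keep only the most recent max_turns (user, assistant) pairs so the
    assembled prompt stays within the model's context window."""
    return messages[-max_turns:] if max_turns > 0 else []
```

Sweeping `max_turns` (e.g. 0, 2, 5) and comparing responses is one way to test whether the context length is what degrades the answer.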
There isn't any improvement with the changes you suggested; I maintain the hypothesis that the LLM is not able to support the context or respond to the given prompt. Regarding the LLM including 'AI' at the beginning of the response, it looks like that is just how the LLM generates the answer. I included instructions in the prompt not to do that, and the LLM continues generating content as if it were a conversational agent and not a generation LLM.
Quote from a chat with @german-grandas, kept here for tracking: "I made an update to the inference service so the LLM can support the generation of a longer answer. That fixed the issue with the short generation of the RAG system when you submitted a question."
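The fix quoted above comes down to letting the model emit more tokens. As an illustration only (the PR's actual change was to the TGI server settings in the Terraform files), the same limit can be controlled per request via the `max_new_tokens` parameter of TGI's `/generate` API; `build_tgi_payload` is a hypothetical helper, and the sampling values are arbitrary examples.

```python
def build_tgi_payload(prompt: str, max_new_tokens: int = 1024) -> dict:
    """Build a request body for HF Text Generation Inference's /generate
    endpoint; raising max_new_tokens allows longer answers."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,  # cap on generated tokens
            "temperature": 0.7,                # example sampling settings
            "do_sample": True,
        },
    }
```

Note that the server-side limits (max input and total tokens, raised for Mistral in this PR) still bound whatever the client requests here.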
See the commit log for the full description. tl;dr: added chat history to the rag-frontend app.