Replies: 13 comments 3 replies
-
🤖 Hey @AndreaBrg, great to see you diving into another challenge with LangChain! Always a pleasure to have you back. How's everything going?

Based on the information you've provided and the context from the LangChain repository, it seems you want to extract the documents retrieved by the retriever tool and access the source documents directly from the agent's response. In that case, you might find the discussion on the issue "RetrievalQA chain return source_documents when using it as a Tool for an Agent" helpful. In that issue, the user was able to return the source documents by setting return_source_documents=True on the chain.

Here is the relevant code from that discussion:

```python
chain = RetrievalQA.from_chain_type(llm=llm, chain_type='stuff', retriever=retriever,
                                    return_source_documents=True, verbose=True,
                                    input_key="question")
```

And then using this chain in the tool:

```python
Tool(
    name="Search summaries",
    func=lambda query: chain({"question": query}),
    description="useful for when you need to answer questions. Input should be a fully formed question."
)
```

Please note that this might make the chain look a bit clunky, but it should allow you to access the source documents directly from the agent's response. I hope this helps! If you have any further questions, feel free to ask.

Sources
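As a rough usage sketch (not part of the original reply; `agent`, `tools`, and the question string are placeholders built around the Tool above), the agent executor can be asked to keep its intermediate steps, so the chain's source_documents stay reachable after the run:

```python
# Sketch, assuming an `agent` and `tools` list containing the RetrievalQA-backed
# Tool above. return_intermediate_steps=True makes the executor return each
# (action, observation) pair; here the observation is the full RetrievalQA
# output dict, including source_documents.
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, return_intermediate_steps=True)

result = agent_executor.invoke({"input": "What do the summaries say about pricing?"})
for action, observation in result["intermediate_steps"]:
    if isinstance(observation, dict) and "source_documents" in observation:
        for doc in observation["source_documents"]:
            print(doc.metadata, doc.page_content[:100])
```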
-
@dosu-bot so there is no way to natively extract the source documents used to generate the output when using create_retriever_tool?
-
🤖 Based on the current implementation of the LangChain framework, there is no built-in method to extract the source documents used by the tool created with create_retriever_tool. The tool only returns the retrieved documents formatted as a single string, joined by the document separator. Similarly, the agent created with create_openai_tools_agent only sees that string and does not expose the underlying Document objects. Therefore, if you need to access the source documents used by the retriever, you will need to customize the tool or capture the documents yourself.

Sources
-
@dosu-bot Could you suggest how to change my code to support the extraction of source documents?

```python
from langchain import hub
from langchain.agents import create_react_agent, create_openai_tools_agent, AgentExecutor
from langchain.tools.retriever import create_retriever_tool
from langchain_core.language_models import BaseLanguageModel
from langchain_core.memory import BaseMemory
from langchain_core.vectorstores import VectorStoreRetriever

from brainybot.llm.chain.settings import ChainAgentSettings


def make_agent_tools(vector_store_retriever: VectorStoreRetriever):
    # TODO make this better
    tool = create_retriever_tool(
        vector_store_retriever,
        name="search_vinodoo_products",
        description="Searches and returns information about the product in the Vinodoo's catalog.",
    )
    return [tool]


def create_conversation_chain(llm_model_type: str, llm_model: BaseLanguageModel,
                              vector_store_retriever: VectorStoreRetriever, memory: BaseMemory,
                              agent_settings: ChainAgentSettings | None) -> AgentExecutor:
    # Create agent tools
    tools = make_agent_tools(vector_store_retriever)

    # Based on the llm model type we create a general or a specific agent
    match llm_model_type.lower():
        case "openai":
            # TODO make better prompt instead of hub.pull("hwchase17/openai-tools-agent")
            agent = create_openai_tools_agent(llm_model, tools, hub.pull("hwchase17/openai-tools-agent"))
            if agent_settings is None:
                agent_settings = ChainAgentSettings.from_openai_default()
        case _:
            # TODO: Get the prompt to use - you can modify this!
            agent = create_react_agent(llm_model, tools, hub.pull("hwchase17/react-chat"))
            if agent_settings is None:
                agent_settings = ChainAgentSettings.from_default()

    # Return the agent
    return AgentExecutor(agent=agent, tools=tools, memory=memory,
                         max_iterations=agent_settings.max_iterations,
                         max_execution_time=agent_settings.max_execution_time,
                         early_stopping_method=agent_settings.early_stopping_method,
                         handle_parsing_errors=agent_settings.handle_parsing_errors)
```
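(A low-effort option, sketched below and not taken from any reply in this thread: expose the tool observations on the AgentExecutor. llm_model, vector_store_retriever, memory and the question string are placeholders; the retrieved text carries no metadata this way.)

```python
# Sketch, assuming create_conversation_chain above is changed so AgentExecutor(...)
# also receives return_intermediate_steps=True (when memory is used, setting
# memory.output_key="output" may also be needed so only the final answer is stored).
# The retriever tool's observation is the concatenated page_content of the
# documents it retrieved.
executor = create_conversation_chain("openai", llm_model, vector_store_retriever, memory, None)
result = executor.invoke({"input": "Tell me about the red wines in the catalog"})

for action, observation in result.get("intermediate_steps", []):
    if action.tool == "search_vinodoo_products":
        print(observation)  # the formatted text returned by the retriever tool
```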
-
🤖 To extract the source documents used by the retriever tool, you could modify the helper functions behind create_retriever_tool so that they return the raw documents alongside the formatted text:

```python
def _get_relevant_documents(
    query: str,
    retriever: BaseRetriever,
    document_prompt: BasePromptTemplate,
    document_separator: str,
    callbacks: Callbacks = None,
) -> (str, list):
    docs = retriever.get_relevant_documents(query, callbacks=callbacks)
    return document_separator.join(
        format_document(doc, document_prompt) for doc in docs
    ), docs


async def _aget_relevant_documents(
    query: str,
    retriever: BaseRetriever,
    document_prompt: BasePromptTemplate,
    document_separator: str,
    callbacks: Callbacks = None,
) -> (str, list):
    docs = await retriever.aget_relevant_documents(query, callbacks=callbacks)
    return document_separator.join(
        format_document(doc, document_prompt) for doc in docs
    ), docs
```

Then, in create_retriever_tool, expose both the formatted text and the original documents:

```python
def create_retriever_tool(
    retriever: BaseRetriever,
    name: str,
    description: str,
    *,
    document_prompt: Optional[BasePromptTemplate] = None,
    document_separator: str = "\n\n",
) -> Tool:
    document_prompt = document_prompt or PromptTemplate.from_template("{page_content}")
    func = partial(
        _get_relevant_documents,
        retriever=retriever,
        document_prompt=document_prompt,
        document_separator=document_separator,
    )
    afunc = partial(
        _aget_relevant_documents,
        retriever=retriever,
        document_prompt=document_prompt,
        document_separator=document_separator,
    )
    return Tool(
        name=name,
        description=description,
        func=lambda query: func(query)[0],  # Only return the formatted documents
        coroutine=lambda query: afunc(query)[0],  # Only return the formatted documents
        args_schema=RetrieverInput,
        original_docs_func=lambda query: func(query)[1],  # Add a new method to get the original documents
        original_docs_coroutine=lambda query: afunc(query)[1],  # Add a new method to get the original documents
    )
```

Now, when you create a tool using this modified create_retriever_tool, the original documents are available through the added original_docs_func and original_docs_coroutine entries. Please note that this modification is not part of the official LangChain repository and is a custom solution to your specific requirement. It's always recommended to keep your LangChain packages up-to-date with the official releases for the best compatibility and support. For more details, you can refer to the LangChain retriever tool source code.

Sources
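A minimal usage sketch for the modified helper above (not from the bot's reply; it assumes the redefined _get_relevant_documents and its imports are in scope, and vector_store_retriever and the query string are placeholders): since the helper already returns a (formatted_text, documents) tuple, the raw Document objects can also be fetched directly, outside the agent loop.

```python
# Sketch: call the modified helper directly to get both the formatted string and
# the raw Document objects (placeholder retriever and query).
from langchain_core.prompts import PromptTemplate

formatted_text, docs = _get_relevant_documents(
    "red wines in the catalog",
    retriever=vector_store_retriever,
    document_prompt=PromptTemplate.from_template("{page_content}"),
    document_separator="\n\n",
)

for doc in docs:
    print(doc.metadata)  # e.g. source, page, url, depending on your loader
```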
-
@AndreaBrg Pass an additional argument like this:
-
Hi! Were you able to figure it out?
-
@zmartin1 Which example are you following? It is not working for me; I'm getting an empty output.
-
Hi! Any updates on this issue? I followed @zmartin1's instructions and I'm now able to retrieve documents/metadata from the documents, but not consistently: the retrieved info is generated by the LLM, so sometimes it returns the values I want and sometimes it returns nothing. Any help with this?
-
@dosu-bot Please read my code segment and correct it. It errors, but I don't know why it errors or how I can correct it.

from langchain.tools.retriever import create_retriever_tool
tool = create_retriever_tool(

Traceback (most recent call last):
-
Pass the option response_format as "content_and_artifact" while creating the tool, for example as sketched below:
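(A minimal sketch of what that could look like; it assumes a langchain-core version where create_retriever_tool accepts response_format, and the retriever, name, and description are placeholders.)

```python
# Sketch: assumes create_retriever_tool supports response_format in your
# installed langchain-core; retriever, name, and description are placeholders.
from langchain.tools.retriever import create_retriever_tool

tool = create_retriever_tool(
    vector_store_retriever,
    name="search_vinodoo_products",
    description="Searches and returns information about products in Vinodoo's catalog.",
    response_format="content_and_artifact",
)

# With "content_and_artifact", the tool's underlying function returns a
# (formatted_text, documents) pair: the text goes to the model as the tool
# content, while the raw Document list is kept as the ToolMessage artifact
# when the tool is invoked via a tool call (e.g. in a LangGraph agent).
```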
-
This is another version that doesn't use response_format="content_and_artifact" but the default response_format="content". This can be useful when you want your RAG agent to pick out relevant metadata (URLs, etc.) to build context along with the Document's page_content. Model: GPT-4o-mini. A sketch of the idea follows.
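(The following is a hedged sketch of that idea, not the poster's original code: it assumes each retrieved Document carries a "source" metadata key, and vector_store_retriever, the tool name, and description are placeholders.)

```python
# Sketch: keep the default response_format="content" but format metadata into
# the tool output via document_prompt, so the agent sees it alongside the text.
# Assumes each retrieved Document has a "source" key in its metadata; missing
# keys would raise an error when the documents are formatted.
from langchain_core.prompts import PromptTemplate
from langchain.tools.retriever import create_retriever_tool

document_prompt = PromptTemplate.from_template(
    "Source: {source}\nContent: {page_content}"
)

tool = create_retriever_tool(
    vector_store_retriever,
    name="search_vinodoo_products",
    description="Searches and returns information about products in Vinodoo's catalog.",
    document_prompt=document_prompt,  # metadata keys become template variables
)
```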
-
Checked other resources
Commit to Help
Example Code
Description
I'd like to extract the documents retrieved by create_retriever_tool when this is used to create an OpenAI agent with create_openai_tools_agent. Currently, when invoked, the agent returns:
How can I retrieve the documents used to create the output?
Thanks
System Info
langchain==0.1.6
langchain-community==0.0.19
langchain-core==0.1.22
langchain-openai==0.0.5
langchainhub==0.1.14
langsmith==0.0.87