Is the implementation in Part 7: Decomposition really an implementation of the Least-To-Most prompting method? #4
Yes, you are correct! Decomposition in general means answering each sub-question. We can either do them in parallel, as shown, or recursively, as in least-to-most. My code example shows the parallel approach. Do you think showing the recursive approach would be of high interest? If so, I will add it!
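The two strategies discussed here can be sketched in a few lines. This is a toy illustration, not the notebook's code: `fake_llm` is a hypothetical stand-in for a real chat-model call, and retrieval is omitted.

```python
# Toy sketch of parallel vs. recursive (least-to-most) decomposition.
# fake_llm is a hypothetical deterministic stub for a real LLM call.

def fake_llm(prompt: str) -> str:
    # "Answers" with the last line of the prompt, so behavior is testable.
    return f"answer({prompt.splitlines()[-1]})"

sub_questions = [
    "What is task decomposition?",
    "What memory types do agents use?",
    "How do agents use tools?",
]

def answer_parallel(questions):
    """Answer each sub-question independently (parallelizable)."""
    return [(q, fake_llm(f"Answer concisely:\n{q}")) for q in questions]

def answer_least_to_most(questions):
    """Answer sequentially, injecting all prior Q&A pairs into each prompt."""
    qa_pairs = []
    for q in questions:
        context = "\n".join(f"Q: {pq}\nA: {pa}" for pq, pa in qa_pairs)
        prompt = f"Previously solved pairs:\n{context}\nNext question:\n{q}"
        qa_pairs.append((q, fake_llm(prompt)))
    return qa_pairs
```

The sequential variant cannot be parallelized because each prompt depends on every earlier answer; the parallel variant trades that dependency for throughput.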
Thank you for your response! In my opinion, it would probably be great to implement the approach that is closer to least-to-most prompting, because when the authors compared CoT with their method, they mentioned that CoT solves sub-questions independently, and the dependent nature of their approach is what improves results (as I understand it). But I don't know whether it is possible to automatically prepare representative examples as described in the paper:
As I understand it, such prompt tuning with optimal example selection and generation is addressed by frameworks like DSPy? By the way, do you think the following method is viable (or perhaps it already exists): what if we generate sub-questions and either calculate some aggregated embedding for them (for example, the mean) or group the sub-questions by their embeddings using a similarity threshold, and then retrieve the documents for each aggregated embedding? Thank you.
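The aggregation idea proposed here can be sketched concretely. This is a toy sketch of the commenter's suggestion, not an existing library feature; the hand-made 2-D vectors stand in for real embedding-model output.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mean_embedding(vectors):
    """Option (a): one aggregated query vector for a single retrieval call."""
    return np.mean(vectors, axis=0)

def group_by_similarity(vectors, threshold=0.9):
    """Option (b): greedy grouping — a vector joins the first group whose
    first member is at least `threshold`-similar; otherwise it starts a
    new group. Returns groups as lists of indices into `vectors`."""
    groups = []
    for i, v in enumerate(vectors):
        for g in groups:
            if cosine(vectors[g[0]], v) >= threshold:
                g.append(i)
                break
        else:
            groups.append([i])
    return groups

# Hand-made 2-D "embeddings": the first two are near-duplicates.
embs = [np.array([1.0, 0.0]), np.array([0.98, 0.05]), np.array([0.0, 1.0])]
```

Each group (or the single mean vector) would then drive one retrieval call instead of one call per sub-question, reducing redundant retrievals for near-duplicate sub-questions.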
Good feedback! I just added it :) Please keep the helpful feedback coming!
Also note: the approach in the notebook is also similar to https://arxiv.org/pdf/2212.10509.pdf, but it does not pre-generate a set of sub-questions up front. The example I show in the notebook combines ideas from both. I find this fairly easy to follow, but I'm open to suggestions!
Hi @rlancemartin, thank you so much for your implementation and explanation, it is very useful!
I've got another question: is the example from the source a good example for decomposition? We've got the question "What are the main components of an LLM-powered autonomous agent system?" and three sub-questions with answers based on RAG decomposition:

Question 1: What is LLM technology and how does it work in autonomous agent systems?

Question 2: What are the specific components that make up an LLM-powered autonomous agent system? Its answer ends with: "These components work in conjunction with the LLM technology, which serves as the core controller in autonomous agent systems, to enable effective decision-making and problem-solving capabilities."

Question 3: How do the main components of an LLM-powered autonomous agent system interact with each other to enable autonomous behavior? Its answer: "These components work in conjunction with the LLM technology, which serves as the core controller in the autonomous agent system. The planning component provides a structured approach to task execution, while reflection and refinement enable the agent to adapt and improve over time. Together, these components enable the autonomous agent to make effective decisions, solve complex problems, and exhibit autonomous behavior." And according to the code, the answer to Question 3 is the final answer.

Is this the answer to the original question, "What are the main components of an LLM-powered autonomous agent system?" In my opinion it is not; it is the answer to the third question from the decomposition, not the original one. I tested the question from the paper https://arxiv.org/pdf/2205.10625 and that example makes more sense, because its last sub-question is a summary question.
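One way to avoid returning the last sub-question's answer as the final answer is an explicit synthesis step that feeds all accumulated Q&A pairs back to the model together with the original question. A toy sketch, not the notebook's code; `fake_llm` is a hypothetical deterministic stub for a real chat-model call.

```python
def fake_llm(prompt: str) -> str:
    # Stub: "answers" with the last line of the prompt, so behavior is testable.
    return f"synthesized({prompt.splitlines()[-1]})"

def synthesize_final_answer(original_question: str, qa_pairs) -> str:
    """Answer the ORIGINAL question using all sub-question Q&A pairs as context,
    instead of returning the last sub-question's answer verbatim."""
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in qa_pairs)
    prompt = (
        "Use these solved sub-questions as context:\n"
        f"{context}\n"
        "Now answer the original question:\n"
        f"{original_question}"
    )
    return fake_llm(prompt)
```

With this step, the final prompt always targets the original question, whatever the last sub-question happened to be.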
Hi @rlancemartin,
I have a question about the implementation of Part 7, where you refer to the Least-To-Most Prompting paper from Google.
You mentioned that this method can be implemented by processing all generated queries independently, but in the paper the authors describe solving each next question sequentially, based on the question-answer pairs already solved at the previous stage (Figure 1 in the paper). Given that, it seems the original version of this method from Google cannot be parallelized, because each subsequent step depends on all the previous sub-questions and answers.
At 2:00 you mentioned that we can answer sub-questions in isolation, but at 2:20 you said that the previous solution will be injected into the context of the next sub-question; however, the implementation that follows does not include the previous Q&A pairs in the next question's context.
Do these statements contradict each other?
Do I understand correctly that your implementation is more similar to the multi-query approach, except that instead of collecting all unique documents for all sub-questions you generate Q&A pairs from those documents and use them as the context for the final answer, while the rest of the logic is exactly the same as in multi-query, so it can be parallelized (as in the multi-query section) instead of processed sequentially in a `for` loop? Also, as I understand it, the core part of both stages of the Least-To-Most prompting technique is the demonstration examples, while the other techniques are optional (from the caption of Figure 1 and Section 2 of the paper):
subproblem. The demonstration examples for each stage’s prompt are omitted in this illustration.
Also, I found the diagram of the Multi-Query RAG implementation in the README of the LCEL Teacher repository (top part of the screenshot), and as I understand it, that diagram describes exactly what you implemented in Part 7 here.
Could you please share your thoughts on that? I am a little confused by this terminology.
Thank you.