[ReAct][Robustness++] Avoid "Reached max iterations" errors by telling Agent how many steps are allowed #12426

Open
tslmy wants to merge 3 commits into main


tslmy
Contributor

@tslmy tslmy commented Mar 29, 2024

Description

Fixes #12209. TL;DR: The Agent can avoid "Reached max iterations" problems if it has been told how many steps it can try.

Before (Screenshot 2024-03-29 at 16 30 37): Agent keeps on trying and gets killed at the framework level.
After (Screenshot 2024-03-29 at 16 22 25): Agent knows when to give up.

New Package?

Not a new package.

Version Bump?

Not applicable.

Type of Change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Added new unit/integration tests
  • Added new notebook (that tests end-to-end)
  • I stared at the code and made sure it makes sense

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran make format; make lint to appease the lint gods

@dosubot dosubot bot added the size:M label (This PR changes 30-99 lines, ignoring generated files.) Mar 29, 2024
@logan-markewich
Collaborator

@tslmy just curious, do you think this will influence the agent to run for all steps, rather than stopping when it is ready? Or what has your experience been like?

@tslmy
Contributor Author

tslmy commented Apr 1, 2024

@logan-markewich good question. I have not yet encountered a case where

  • the Agent used to consume just one or two steps to complete a task before this PR and
  • started to consume all allowances with this PR,

only cases where

  • it used to consume too many steps before the PR and
  • started capping at its allowed steps with the PR.

But I think that is really due to the small number of cases I tested and the limited capability of the 7B LLMs I tried locally.

@logan-markewich
Collaborator

@tslmy sorry for any delay here. I'm a tad worried about merging a change like this to the prompt. Would it be better to use the error handling we added before to handle when max iterations is reached?

@tslmy
Contributor Author

tslmy commented Apr 4, 2024

No worries at all @logan-markewich . I barely have time to attend to my side projects during weekdays anyway.

Let me try to convince you that this must go into the prompt. The whole point is to let the Agent know, from the very beginning, that it must complete the task in no more than N steps, so that it can better plan out the procedure.

The error handling won't cut it. Why? Let's suppose the Agent is at its (N+1)-th step out of N allowed steps. We raise an exception. The error handler catches it. The handler offers Agent a 2nd chance to fix it. The Agent tries something different, which would be the (N+2)-th step. We raise an exception. The error handler catches it. The handler offers Agent a 3rd chance to fix it... You see where this is going ;)
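To make the loop above concrete, here is a minimal simulation, not the framework's actual code. `MaxIterationsReached`, `run_agent`, and `second_chances` are hypothetical names for illustration; the only assumption carried over from the argument is that the agent does not know its budget and the handler simply grants another try.

```python
class MaxIterationsReached(Exception):
    """Hypothetical error raised when the agent exceeds its step budget."""


def run_agent(allowed_steps: int, second_chances: int) -> list[int]:
    """Simulate an agent that is unaware of its budget.

    Returns the step numbers at which the framework had to raise.
    `second_chances` exists only so the simulation terminates.
    """
    overflow_steps = []
    step = 0
    chances_left = second_chances
    while True:
        step += 1  # the agent, unaware of any budget, tries one more step
        try:
            if step > allowed_steps:
                raise MaxIterationsReached(f"step {step} exceeds {allowed_steps}")
        except MaxIterationsReached:
            overflow_steps.append(step)
            if chances_left == 0:
                break  # the handler finally stops offering retries
            chances_left -= 1  # the handler offers another chance
    return overflow_steps


print(run_agent(allowed_steps=3, second_chances=2))  # [4, 5, 6]
```

Every extra chance just produces one more overflowing step, so the handler only postpones the failure instead of helping the agent plan within its budget.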

I'm actually kinda OK with removing the "(X of Y steps)" part at every "Thought:" line, if that's what's making you uncomfortable, because it only serves as a reminder to the LLM that "hey, you've consumed X steps out of your allowance of Y, so you gotta hurry up in your next Y-X steps".

@logan-markewich
Collaborator

@tslmy Yea, you nailed it, the (1 of x) in the thought line was making me a tad nervous 👀

@tslmy tslmy force-pushed the step-counting-prompt branch from 975452a to 3680116 April 10, 2024 02:37
@tslmy
Contributor Author

tslmy commented Apr 10, 2024

@logan-markewich no problem, Logan. I've updated the PR, removing the "(X of Y steps)" part at every "Thought:" line but keeping the "you should cap it at X steps" line. Does it look good to you?

In my own projects, I always override the default prompt anyway, so no hard feelings.
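For readers who also override the prompt, a rough sketch of what a step-cap line could look like follows. The template text and the names `HYPOTHETICAL_TEMPLATE` and `build_system_prompt` are made up for illustration; this is not the wording in the PR or in the library's default prompt.

```python
# Hypothetical wording; the actual prompt text in the PR may differ.
HYPOTHETICAL_TEMPLATE = (
    "You are an agent that answers questions using tools.\n"
    "You should cap your work at {allowance} steps. "
    "If you cannot finish within that budget, give up and report what you found.\n"
)


def build_system_prompt(allowance: int) -> str:
    """Fill the step cap into the (hypothetical) system prompt template."""
    if allowance < 1:
        raise ValueError("allowance must be at least 1 step")
    return HYPOTHETICAL_TEMPLATE.format(allowance=allowance)


print(build_system_prompt(2))
```

The cap is stated once, up front, so the model can plan around it without a per-thought counter.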

@CP-Geeky

CP-Geeky commented Dec 30, 2024

@tslmy, can you describe the behavior of the agent if the max_iteration value is set to 5?

As per my understanding,
allowance = (max_iteration - 1)/2;
With this, allowance will be 2.

Then, out of 5 steps, the agent will take

  • 2 iterations to execute/find the answer
  • 1 iteration for the observation step
  • 2 steps for the allowance, meaning the agent will prepare the final answer based on its findings.

Is this the way it will work?

@tslmy
Contributor Author

tslmy commented Dec 30, 2024

Hi @CP-Geeky ,

When max_iterations is set to 5, allowance will be (5-1) // 2 = 2. The agent will see that it is only allowed to take 2 steps.

The agent will say,

Thought: (Step 1 out of 2) I need to do X.
Action: ...
Action Input: ...

which counts as one iteration.

This action will yield an output. The agent will take a 2nd iteration to parse it:

Observation: Y

The agent then takes a 3rd iteration to come up with a follow-up action:

Thought: (Step 2 out of 2) I need to do X.
Action: ...
Action Input: ...

This action will yield an output. The agent will take a 4th iteration to parse it:

Observation: Y

The LLM is then invited to think over the current situation, which would be our 5th (and final) iteration. In this iteration, the LLM should notice (from the "step A out of B" prefixes it has been prepending to the thoughts) that it has run out of allowed steps. Thus, it should give up:

Thought: I see that doing X has given me Y. I would have continued to do Z, but I'm running out of allowance of steps, so I'll just give up.
Answer: I don't know how to answer your question. The best I can tell you is that by doing X I know Y. You might wanna do Z yourself.

Note that some LLMs (and sometimes the same LLM) will treat the final "thought-answer" pair as a step, just like the "after" screenshot I pasted in the PR description. That's fine.
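The walkthrough above can be sketched as a small script. It only mirrors the iteration accounting described in this comment (two iterations per step, plus one final iteration); `allowance_for` and `iteration_plan` are hypothetical helper names, and none of this is checked against the library's internals.

```python
def allowance_for(max_iterations: int) -> int:
    """Steps shown to the agent: floor division, so (5 - 1) // 2 == 2."""
    return (max_iterations - 1) // 2


def iteration_plan(max_iterations: int) -> list[str]:
    """List what each iteration is spent on, per the walkthrough above."""
    allowance = allowance_for(max_iterations)
    plan = []
    for step in range(1, allowance + 1):
        plan.append(f"Thought/Action (step {step} out of {allowance})")
        plan.append("Observation")
    # One last iteration is reserved for answering or giving up.
    plan.append("Thought/Answer (answer or give up)")
    return plan


for number, what in enumerate(iteration_plan(5), start=1):
    print(number, what)
```

With an even `max_iterations` such as 6, the floor division still gives an allowance of 2, so the plan uses 5 iterations and leaves one spare.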

Successfully merging this pull request may close these issues.

[Feature Request]: ReAct Agent: Inform the Agent that it's running out of steps
3 participants