Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor of error handling #4575

Draft
wants to merge 23 commits into
base: main
Choose a base branch
from
Draft

Refactor of error handling #4575

wants to merge 23 commits into from

Conversation

rbren
Copy link
Collaborator

@rbren rbren commented Oct 26, 2024

End-user friendly description of the problem this fixes or functionality that this introduces

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below
    no changelog

Give a summary of what the PR does, explaining any non-trivial design decisions
In https://github.com/All-Hands-AI/OpenHands/pull/4573/files @xingyaoww found that a FatalErrorObservation was being dropped back into the event stream from a handler 🙈

This should cause a hard error in those cases


Link of any specific issues this addresses

@rbren rbren marked this pull request as draft October 26, 2024 03:32
@rbren rbren changed the title prevent evenstream from looping back on itself Minor refactor of agent controller Oct 26, 2024
ErrorObservation(f'{message}:{detail}'), EventSource.USER
detail += '\nPlease check your credentials. Is your API key correct?'
await self.event_stream.async_add_event(
ErrorObservation(f'{message}:{detail}'), EventSource.AGENT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method was initially meant only for informing the user in the UI. But it had such a good name, that we felt like it should be used almost like logging errors. 😂

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😂

TBH I don't think an AgentController should know anything about a UI

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's not that it knows it, strictly speaking, just the ErrorObs created here has a message intended for the chat box / end user's eyes.

And by this time, it doesn't have anymore the info about the exception, except... it did, when it got it as param for some exception info intended for evals.

Copy link
Collaborator

@enyst enyst Oct 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, you're right, and your way of looking at it might just solve this little strange thing. How about:

  • we don't do at all these UI-intended lines of code here, in the controller
  • create an ErrorObservation
    • with all info potentially needed, e.g. for the backend / LLM
    • including exception info
  • put it in the stream
  • let the agent session interpret it and send it over to the UI with whatever info it makes of it
    • e.g. a dict with the right source, a certain message, perhaps remove stuff it doesn't want. 🤔

@@ -12,6 +12,7 @@ class ErrorObservation(Observation):
E.g., Linter error after editing a file.
"""

fatal: bool = False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I thought adding a FatalErrorObservation is a good thing, is that we treat ErrorObs too much the same, when there are essential differences, such as:

  • some should be in agent's history, some not;
  • some should have a user-friendly message, some should have an LLM-friendly content.

In this case, it seems we can also add:

  • some should stop the controller, some should not.

Not distinguishing them well right now, is IMHO a pretty big part of what is confusing in our current error handling. The alternative might be to go back to ErrorObs are in one category, and exceptions in the other... I guess.

Copy link
Collaborator Author

@rbren rbren Oct 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I do think it's important to distinguish them (hence the fatal=True) but we end up having to check isinstance(ErrorObs) or isinstance(FatalErrorObs), and it seems ripe for someone forgetting to handle one or the other, or for the list of error types to grow indefinitely

In general I want to pare down the number of unique things that could end up in the EventStream, because as we add long-term state, we're going to have to ensure old EventStreams still deserialize properly, and the smaller the footprint the better.

some should stop the controller, some should not.

This seems like the only thing fatal should really distinguish. I.e. if you hit a fatal Error, the event stream stops until there's a user interaction. I see what you mean about one being user-facing, and the other being LLM-facing though...

Interestingly, as I look through a lot of the fatal ones, they seem like things the agent should be able to recover from (e.g. submitting a non-existent action)

Will think more about this

Copy link
Collaborator

@enyst enyst Oct 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, yes, good points. Re: deserialization, I just hit something similar.

About this bit:

I see what you mean about one being user-facing, and the other being LLM-facing though...

What do you think about adding a "cause" or "error" attr to the ErrorObs, with the underlying exception string, at least, if not more of the underlying exception? (We're already using a "cause" attribute, maybe we could still rename that to "parent", but yep deserialization)

So that the agent has an ErrorObs with that info.

I seem to recall having tried it, though, but I didn't like the result, because we may want to be a bit careful to make it very easy for developers to use in all the places where an ErrorObs is created or where that underlying exception is handled. Otherwise it will be forgotten, as you just said. 😅 My attempt didn't achieve that IMO. That was a while ago, maybe it's worth another look if there is some reasonable way to do it?

Copy link
Collaborator

@enyst enyst Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note, actually, while the isinstance() issue is a huge red flag, that one could be also addressed by subclassing... 🤔

# it's important to set the state before adding the error event, so that metrics sync properly
await self.set_agent_state_to(new_state)
await self.event_stream.async_add_event(
ErrorObservation(message), EventSource.AGENT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure about changing USER to AGENT? We were just discussing about this.

Copy link
Collaborator

@enyst enyst Oct 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not matter much with tool use, because that one is role=tool! Or I hope it stays role tool, anyway. (I'm not sure yet if/how all tool use APIs support the role)

Edited to add: well, it will matter for those who don't support it.

Copy link
Collaborator

@enyst enyst Oct 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! For the UI we need it AGENT, right? Well... So there is a hybrid UI/LLM in ErrorObservations and it's hitting us. 🤔

We could adjust it in the agent for obs... but if that's what you want to go with, please lets add some comment. We will forget, because some of us reflexively assume that the LLM's way is the way things should be. 😂

Alternatively, maybe that idea linked above, to have in our code an "environment" (or "tool") source (e.g. EventSource.ENV), is not too bad. Each component or agent would interpret it in its own way.

@ngduyanhece
Copy link

PR Reviewer Comments

After analyzing the pull request, here are my findings:

Overall Feedback:

The minor refactor of the agent controller is a positive step towards improving the codebase's maintainability and clarity. The changes made to handle errors more effectively and the removal of FatalErrorObservation in favor of a more general ErrorObservation are commendable. However, there are opportunities to enhance the code further, particularly in terms of consistency and error handling.

Score: 87/100

Labels: Bug fix, Enhancement

Code Suggestions:

  1. File: openhands/runtime/utils/bash.py

    • Suggestion Content: Ensure that the run method consistently returns the same type of observation to avoid type-related issues.
    • Relevant Line: + def run(self, action: CmdRunAction) -> CmdOutputObservation | ErrorObservation:
    • Existing Code:
      def run(self, action: CmdRunAction) -> CmdOutputObservation | FatalErrorObservation:
    • Improved Code:
      def run(self, action: CmdRunAction) -> CmdOutputObservation | ErrorObservation:
  2. File: openhands/runtime/utils/edit.py

    • Suggestion Content: Consider using a consistent naming convention for error handling to improve readability and maintainability.
    • Relevant Line: + return ErrorObservation(...)
    • Existing Code:
      return ErrorObservation(
          f'Fatal Runtime in editing: Expected FileWriteObservation, got {type(obs)}: {str(obs)}',
      )
    • Improved Code:
      return ErrorObservation(
          f'Error in editing: Expected FileWriteObservation, got {type(obs)}: {str(obs)}',
      )
  3. File: tests/unit/test_agent_controller.py

    • Suggestion Content: Ensure that the test names clearly reflect the functionality being tested for better clarity.
    • Relevant Line: +async def test__react_to_error(mock_agent, mock_event_stream):
    • Existing Code:
      async def test_report_error(mock_agent, mock_event_stream):
    • Improved Code:
      async def test_react_to_error(mock_agent, mock_event_stream):

The minor refactor of the agent controller is a positive step! Here are some suggestions:

  1. Ensure that the run method consistently returns the same type of observation to avoid type-related issues.
  2. Consider using a consistent naming convention for error handling to improve readability.
  3. Ensure that test names clearly reflect the functionality being tested for better clarity.

@@ -275,7 +275,7 @@ def _continue_bash(
output += '\r\n' + bash_prompt
return output, exit_code

def run(self, action: CmdRunAction) -> CmdOutputObservation | FatalErrorObservation:
def run(self, action: CmdRunAction) -> CmdOutputObservation | ErrorObservation:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Ensure consistent return types in the run method.

Existing Code:

def run(self, action: CmdRunAction) -> CmdOutputObservation | FatalErrorObservation:

Improved Code:

def run(self, action: CmdRunAction) -> CmdOutputObservation | ErrorObservation:

Details:
Ensure that the run method consistently returns the same type of observation to avoid type-related issues.

@@ -214,8 +213,9 @@ def edit(self, action: FileEditAction) -> Observation:
if isinstance(obs, ErrorObservation):
return obs
if not isinstance(obs, FileWriteObservation):
return FatalErrorObservation(
f'Fatal Runtime in editing: Expected FileWriteObservation, got {type(obs)}: {str(obs)}'
return ErrorObservation(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Use consistent naming for error handling to improve readability.

Existing Code:

return ErrorObservation(
    f'Fatal Runtime in editing: Expected FileWriteObservation, got {type(obs)}: {str(obs)}',
)

Improved Code:

return ErrorObservation(
    f'Error in editing: Expected FileWriteObservation, got {type(obs)}: {str(obs)}',
)

Details:
Consider using a consistent naming convention for error handling to improve readability and maintainability.

@@ -87,7 +87,7 @@ async def test_on_event_change_agent_state_action(mock_agent, mock_event_stream)


@pytest.mark.asyncio
async def test_report_error(mock_agent, mock_event_stream):
async def test__react_to_error(mock_agent, mock_event_stream):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Ensure test names reflect the functionality being tested.

Existing Code:

async def test_report_error(mock_agent, mock_event_stream):

Improved Code:

async def test_react_to_error(mock_agent, mock_event_stream):

Details:
Ensure that the test names clearly reflect the functionality being tested for better clarity.

enyst and others added 2 commits October 27, 2024 02:52
await self.event_stream.async_add_event(
ErrorObservation(message), EventSource.AGENT
)
raise e
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is all that's left of this method... it seems we don't really need it, maybe we can set the agent_state in the catch block 🤔
Is this state ever different than ERROR?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one (one...) PAUSED

@rbren rbren changed the title Minor refactor of agent controller Refactor of error handling Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants