[Bug]: MessageTokenLimiter ignored for output of Tools #2469
Comments
Thanks. If you make a PR, please add @gagb and @WaelKarkoub as reviewers.
Hi @daanraman, thanks for your feedback. I believe the accuracy of tool outputs is crucial, and truncating might omit valuable information. This issue might reflect a design decision rather than a bug, but there are possible solutions. I'm working on a PR that applies LLMLingua for text compression, which might be a better fit for managing tool outputs, though I'm not certain of its effectiveness. We could also consider a new transform specifically for tool outputs that truncates differently, e.g. from the middle, to preserve more context. Let me know what you think.
Hi @sonichi / @WaelKarkoub - thanks for the quick feedback, appreciated. I understand the reasoning behind not truncating tool output. If that's a design choice, though, I think it's confusing that the output suggests the tokens were truncated when that doesn't seem to be the case.

The output of the tool itself should fit into the context window of the LLM I am using. However, the main reason I was trying to truncate the output of the Tool is to avoid filling up the history sent to the LLM with the large Tool output, which later steps don't need in order to understand the context of the conversation (they should base themselves on the output of the Agent that used the tool to come up with an answer).

So my questions are:

1. Is the Tool output of previous steps included in the history window of later steps in the conversation, or is it excluded?
2. If it is in fact included in later steps (and thus fills up the context window with Tool output), is using Nested Chats a way to "hide" the Tool output of previous steps?
@daanraman I see where the confusion lies now: the logs indicate truncation of the message content without accounting for tool outputs. I'll open a PR to clarify the logging for this transform. Out of curiosity, are you applying the `MessageTokenLimiter` to manage the tool outputs?

If possible, consider creating a custom transform to extract the essential information from tool outputs. This approach could add value to AutoGen. Would you be interested in collaborating on this?
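A rough sketch of what such a tool-output transform could look like, following the `apply_transform` / `get_logs` shape of AutoGen's built-in transforms. The class name, the character budget, the truncation marker, and the assumption that tool results arrive as `role == "tool"` messages with string content are illustrative, not part of the library:

```python
import copy
from typing import Any, Dict, List, Tuple


class ToolOutputMiddleTruncator:
    """Hypothetical transform: truncate only tool messages, keeping head and tail."""

    def __init__(self, max_chars: int = 2000):
        self._max_chars = max_chars

    def apply_transform(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        messages = copy.deepcopy(messages)
        for message in messages:
            content = message.get("content")
            # Assumption: tool results appear as role == "tool" with string content;
            # the exact message shape may differ across AutoGen versions.
            if message.get("role") == "tool" and isinstance(content, str) and len(content) > self._max_chars:
                half = self._max_chars // 2
                # Keep the beginning and the end; drop the middle to preserve context.
                message["content"] = content[:half] + "\n...[truncated]...\n" + content[-half:]
        return messages

    def get_logs(
        self, pre_transform_messages: List[Dict], post_transform_messages: List[Dict]
    ) -> Tuple[str, bool]:
        changed = pre_transform_messages != post_transform_messages
        log = "Truncated tool outputs from the middle." if changed else "No tool output exceeded the limit."
        return log, changed
```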
@WaelKarkoub thanks for the time & feedback. Correct, I was applying the `MessageTokenLimiter`.

Still very new to autogen (moved away from crew.ai yesterday), but I'm liking it very much so far: great documentation and examples, and in general the agents behave better by default, I feel (better system prompts & history management). The group chat features are great too. I will certainly consider contributing once I am a bit more familiar with both the framework & the codebase!
Describe the bug
`MessageTokenLimiter` truncates messages as expected when the output is not generated by Tools.
However, when Tools are used, the tool output does not appear to be truncated before being sent to the model.
Steps to reproduce
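The original reproduction script is not shown here. Below is a minimal sketch of the kind of setup that exhibits the problem, assuming AutoGen's `TransformMessages` capability; the tool name, token limit, and prompt are placeholders, and the model matches the one reported below.

```python
import os

import autogen
from autogen.agentchat.contrib.capabilities import transform_messages, transforms

# Assumption: API key is read from the environment.
llm_config = {"config_list": [{"model": "gpt-3.5-turbo", "api_key": os.environ["OPENAI_API_KEY"]}]}

assistant = autogen.AssistantAgent(name="assistant", llm_config=llm_config)
user_proxy = autogen.UserProxyAgent(
    name="user_proxy", human_input_mode="NEVER", code_execution_config=False
)

# Hypothetical tool returning far more text than the token limit allows.
def fetch_large_document() -> str:
    return "lorem ipsum " * 50_000

autogen.register_function(
    fetch_large_document,
    caller=assistant,
    executor=user_proxy,
    description="Fetch a large document.",
)

# Apply the MessageTokenLimiter transform to the assistant.
context_handling = transform_messages.TransformMessages(
    transforms=[transforms.MessageTokenLimiter(max_tokens_per_message=500)]
)
context_handling.add_to_agent(assistant)

# The log reports truncation, but the full tool result is still sent to the model,
# eventually hitting the rate limit.
user_proxy.initiate_chat(assistant, message="Fetch the document and summarize it.")
```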
Model Used
gpt-3.5-turbo
Expected Behavior
The truncated output should be sent to the GPT model, not the entire output; sending the full output triggers a rate-limit error.
Screenshots and logs
The screenshots show that even though the print statement says the tokens were limited, the non-truncated output appears to be sent to the GPT model.
The output shows that my tool's output is correctly being truncated:

This seems to be ignored when calling the LLM, though, resulting in a rate-limit error:

Additional Information
No response