[Bug]: AutoGen can't work with vLLM v0.5.1 #3120
Comments
Hey @tonyaw, this is a bit tricky in my opinion. I feel that it should return [...]. What do you think?
@marklysze, I'm OK with both None and "[]" as long as it is aligned between the agent framework (AutoGen) and the LLM inference framework (vLLM). :-)
I also opened the same ticket with vLLM. Let's align with the vLLM team on an agreement. :-)
We integrated tools into vLLM with function-calling models. Might be relevant: https://docs.rubra.ai/inference/vllm
@sanjay920, is there any possible way to bypass this? This really gives me a headache...
I can suggest a couple of approaches:
If someone wants to work on a PR, that would help.
What is the latest on this?
Any update on this issue? Is there a workaround now?
@tonyaw, can you check if the vLLM integration issue is fixed for v0.4?
@ekzhu, Thanks!
I found the difference:
@tonyaw, there is no max_consecutive_auto_reply for an agent. Each agent generates only one auto response each time it is called. If you put an agent in a team, then it is up to the team to decide when to call which agent.
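For reference, here is a minimal sketch of a single-agent team in the v0.4 API, where max_turns governs how many agent turns the team runs per call; the import paths, model client, and model name are assumptions based on the autogen_agentchat / autogen_ext packages:

```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.base import TaskResult
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient  # assumed model client


async def main() -> None:
    # The team, not the agent, decides when the agent is called.
    model_client = OpenAIChatCompletionClient(model="gpt-4o")  # placeholder model name
    assistant = AssistantAgent("assistant", model_client=model_client)

    # max_turns=1 stops the team after a single agent turn per run() call.
    team = RoundRobinGroupChat([assistant], max_turns=1)

    result = await team.run(task="Summarize why an empty tool_calls list can break clients.")
    assert isinstance(result, TaskResult)
    print(result.messages[-1].content)


asyncio.run(main())
```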
Yes.
@ekzhu,

```python
def communicate_with_assistant(self, *args, **kwargs):
    # Check if there's an existing event loop.
    try:
        # If called from an async context:
        loop = asyncio.get_event_loop()
        if loop.is_running():
            # Create a task and wait for it.
            future = asyncio.ensure_future(self.async_communicate_with_assistant(*args, **kwargs))
            return asyncio.run_coroutine_threadsafe(future, loop).result()
    # except RuntimeError:
    except Exception:
        self.logger.exception("get_event_loop failure:")
        # If there's no event loop, fall through and create a new one.
    # If called from a sync context:
    return asyncio.run(self.async_communicate_with_assistant(*args, **kwargs))
```
```python
async def async_communicate_with_assistant(self, user_prompt, check_func=None, item_key=None, **kwargs):
    """
    Arguments:
    - `self`:
    - `user_prompt`:
    """
    self.logger.info(f"user_prompt={user_prompt}")
    # Run the team and stream messages to the console.
    stream = self.agent_team.run_stream(task=user_prompt)
    answer_message = ""
    async for chunk in stream:
        self.logger.info(f"chunk={chunk}")
        if type(chunk) is TaskResult:
            answer_message = chunk.messages[1].content
            self.logger.info(f"Got answer\n{answer_message}")
```

If communicate_with_assistant is called a second time, I get the following error:

The fix is to recreate the RoundRobinGroupChat each time:

```python
async def async_communicate_with_assistant(self, user_prompt, check_func=None, item_key=None, **kwargs):
    """
    Arguments:
    - `self`:
    - `user_prompt`:
    """
    self.logger.info(f"user_prompt={user_prompt}")
    # <<<<< Add the following line:
    self.agent_team = RoundRobinGroupChat([self.assistant], max_turns=1)
    # Run the team and stream messages to the console.
    stream = self.agent_team.run_stream(task=user_prompt)
```

I wonder if this is the right usage. If not, could you please help provide the right one?
I think that if you are calling from an async context, you cannot use a new event loop (I am not 100% sure). But a function should be either sync or async, not both.
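A minimal sketch of that advice: keep the public method async only and let the outermost caller own the event loop with a single asyncio.run(). The class, attribute, and method names mirror the snippet above and are otherwise assumptions:

```python
import asyncio

from autogen_agentchat.base import TaskResult  # assumed import path


class AssistantWrapper:
    """Hypothetical wrapper around a RoundRobinGroupChat team, as in the snippet above."""

    def __init__(self, agent_team, logger):
        self.agent_team = agent_team
        self.logger = logger

    async def communicate_with_assistant(self, user_prompt: str) -> str:
        """Async-only entry point: no event-loop juggling inside the method."""
        answer_message = ""
        async for chunk in self.agent_team.run_stream(task=user_prompt):
            self.logger.info(f"chunk={chunk}")
            if isinstance(chunk, TaskResult):
                # Take the last message of the run; adjust the index to your needs.
                answer_message = chunk.messages[-1].content
        return answer_message


# Synchronous callers create the loop exactly once, at the outermost level:
#     answer = asyncio.run(wrapper.communicate_with_assistant("hello"))
# Async callers simply await it:
#     answer = await wrapper.communicate_with_assistant("hello")
```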
This will be available in the next release next week (#4924).
Describe the bug
Starting with v0.5.0, vLLM supports a new feature, "OpenAI tools support named functions":
https://github.com/vllm-project/vllm/releases/tag/v0.5.0
Since then, every message returned by vLLM includes an empty "tool_calls" list when the user prompt doesn't intend to call a tool.
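For illustration, here is a hedged sketch of a client-side shim that normalizes an empty tool_calls list to None before a raw chat-completions response reaches the framework; the field names follow the OpenAI chat-completions schema, and where exactly such a shim would hook into AutoGen or vLLM is an assumption, not part of either project:

```python
def normalize_tool_calls(response: dict) -> dict:
    """Turn `"tool_calls": []` into `"tool_calls": None` in a chat-completions payload.

    Hypothetical workaround: some servers emit an empty list when no tool is
    requested, while clients may expect the field to be null or absent.
    """
    for choice in response.get("choices", []):
        message = choice.get("message", {})
        if message.get("tool_calls") == []:
            message["tool_calls"] = None
    return response


# Example:
# resp = {"choices": [{"message": {"role": "assistant", "content": "Hi", "tool_calls": []}}]}
# normalize_tool_calls(resp)["choices"][0]["message"]["tool_calls"]  # -> None
```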
Steps to reproduce
See description.
Model Used
Llama3 70B.
It appears to be a communication issue between vLLM and AutoGen, not related to the LLM itself.
Expected Behavior
AutoGen should work with vLLM v0.5.0 and later versions without problems.
Screenshots and logs
No response
Additional Information
No response