Hi! Thanks for your great work!
Have you tested the performance when CoT/reasoning models such as the DeepSeek-Distill series are served by LServe? The LServe paper mentions that LServe aims to accelerate both the prefilling and decoding stages, but there seem to be no results on reasoning benchmarks such as AIME 2024 or MATH-500.
Thanks for your reply!