Issue encountered
With the vLLM backend, there is currently no way to control the batch size defined here, and the vLLM model config does not offer a way to set one. However, vLLM itself lets us cap the maximum number of sequences processed concurrently (effectively the batch size), as shown in examples such as this.
Solution/Feature
Propagate the max_num_seqs parameter into the initialization of the vLLM model.
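A minimal sketch of what this could look like, assuming a small helper around engine construction (the helper name and config plumbing are hypothetical; max_num_seqs itself is an existing vLLM engine argument):

```python
# Hypothetical sketch: forward a max_num_seqs setting into the vLLM engine.
# max_num_seqs is a real vLLM engine argument; build_vllm_model and the
# surrounding config plumbing are made up for illustration.
from typing import Optional

from vllm import LLM


def build_vllm_model(model_name: str, max_num_seqs: Optional[int] = None) -> LLM:
    """Create a vLLM engine, optionally capping concurrent sequences."""
    engine_kwargs = {"model": model_name}
    if max_num_seqs is not None:
        # Upper bound on how many sequences the continuous-batching
        # scheduler keeps in flight at once.
        engine_kwargs["max_num_seqs"] = max_num_seqs
    return LLM(**engine_kwargs)


# Example: cap concurrency at 32 sequences.
llm = build_vllm_model("facebook/opt-125m", max_num_seqs=32)
```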
Possible alternatives
The alternative would be to implement batching ourselves, but that is overkill since the vLLM backend already supports it.
I'm happy to take this on, but I figured it warrants a discussion first, since vLLM uses continuous batching, which behaves differently from the fixed batch size commonly understood in the LightEval world.
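To make the distinction concrete, here is a rough sketch (the model name, prompt count, and cap of 8 are placeholders): a single generate() call with max_num_seqs=8 does not split 100 prompts into fixed batches of 8; the scheduler admits new prompts as running ones finish.

```python
from vllm import LLM, SamplingParams

# Cap the scheduler at 8 concurrent sequences; this is an upper bound,
# not a fixed batch size.
llm = LLM(model="facebook/opt-125m", max_num_seqs=8)

prompts = [f"Question {i}: what is {i} + {i}?" for i in range(100)]
params = SamplingParams(temperature=0.0, max_tokens=32)

# One call; vLLM's continuous batching handles the scheduling internally,
# refilling slots as individual sequences complete.
outputs = llm.generate(prompts, params)
print(len(outputs))  # 100 results, in the same order as the prompts
```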