[FT] Propagate batch size control for vLLM backend #573

Open
alvin319 opened this issue Feb 18, 2025 · 1 comment
Labels
feature request New feature/request

Comments

@alvin319

Issue encountered

With the vLLM backend, there is currently no way to control the batch size defined in here, and the vLLM model config offers no option for setting a specific batch size. However, vLLM itself lets us cap the maximum number of sequences it processes at once (effectively the batch size), as shown in examples such as this.

Solution/Feature

  • Propagate the max_num_seqs parameter into the initialization of the vLLM model.
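
A minimal sketch of what this propagation could look like, assuming a config object with an optional `max_num_seqs` field. `VLLMConfigSketch` and `build_engine_kwargs` are hypothetical names for illustration, not LightEval's actual classes; the only vLLM-specific detail relied on is that `max_num_seqs` is accepted as an engine argument by `vllm.LLM`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class VLLMConfigSketch:
    """Hypothetical stand-in for the vLLM model config (not LightEval's real class)."""

    model_name: str
    # None means "do not override": vLLM keeps its own default.
    max_num_seqs: Optional[int] = None


def build_engine_kwargs(config: VLLMConfigSketch) -> Dict[str, Any]:
    """Collect keyword arguments to forward to the vLLM engine constructor."""
    kwargs: Dict[str, Any] = {"model": config.model_name}
    if config.max_num_seqs is not None:
        if config.max_num_seqs < 1:
            raise ValueError("max_num_seqs must be a positive integer")
        # Only forwarded when the user set it explicitly.
        kwargs["max_num_seqs"] = config.max_num_seqs
    return kwargs


# Usage (requires vLLM to be installed; the model name is a placeholder):
# from vllm import LLM
# llm = LLM(**build_engine_kwargs(VLLMConfigSketch("my-org/my-model", max_num_seqs=64)))
```

Leaving the field as `None` by default keeps existing behavior unchanged for users who never set it.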

Possible alternatives

  • Implement batching ourselves, which is overkill since the vLLM backend already supports it.
@alvin319 added the feature request label Feb 18, 2025
@alvin319
Author

I'm happy to take this on, but I figured it warrants a discussion first, since vLLM uses continuous batching, which behaves differently from the "fixed" batch size commonly assumed in the LightEval world.
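
To make the difference concrete, here is a toy model (an illustration only, not how vLLM's scheduler is implemented). It assumes each request needs a known number of decode steps and ignores prefill, memory limits, token budgets, and preemption. With a fixed batch, every batch runs until its longest request finishes; with continuous batching, a freed slot is refilled immediately, so `max_num_seqs` acts as a concurrency cap rather than a batch boundary.

```python
import heapq
from typing import List


def static_batching_steps(lengths: List[int], batch_size: int) -> int:
    """Total steps when requests run in fixed batches, each waiting on its longest member."""
    total = 0
    for i in range(0, len(lengths), batch_size):
        total += max(lengths[i:i + batch_size])
    return total


def continuous_batching_steps(lengths: List[int], max_num_seqs: int) -> int:
    """Total steps when a finished request's slot is handed to the next request right away."""
    running: List[int] = []  # min-heap of finish times for in-flight requests
    now = 0
    for n in lengths:
        if len(running) == max_num_seqs:
            # All slots busy: wait until the earliest in-flight request finishes.
            now = heapq.heappop(running)
        heapq.heappush(running, now + n)
    return max(running) if running else 0


lengths = [1, 8, 1, 8]
print(static_batching_steps(lengths, 2))      # 16
print(continuous_batching_steps(lengths, 2))  # 10
```

In this toy setting the same cap of 2 gives 16 steps with fixed batches but 10 with continuous batching, which is why `max_num_seqs` does not map one-to-one onto a conventional batch size.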
