Add sleep mode feature for Ascend NPU #416

Merged
merged 1 commit into from
Apr 7, 2025

@celestialli celestialli commented Mar 28, 2025

What this PR does / why we need it?

This PR adds a sleep mode feature to vllm-ascend. When the engine sleeps, it mainly does two things:

  • offloads the model weights
  • discards the KV cache

RLHF tools (such as https://github.com/volcengine/verl and https://github.com/OpenRLHF/OpenRLHF) have a strong need for sleep mode to accelerate the training process.

This PR may resolve #375 and #320.

Does this PR introduce any user-facing change?

No existing user interfaces are changed.
Users get two new methods: sleep() and wake_up().
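
As a rough illustration of how the two new methods might be paired in calling code, here is a minimal sketch. The `sleeping` helper is hypothetical and not part of this PR; it only assumes that the object passed in exposes `sleep(level=...)` and `wake_up()` as described above.

```python
from contextlib import contextmanager


@contextmanager
def sleeping(llm, level=1):
    """Hypothetical helper: keep `llm` asleep for the duration of a with-block.

    Assumes only that `llm` has sleep(level=...) and wake_up() methods.
    """
    llm.sleep(level=level)
    try:
        yield llm
    finally:
        # Wake the engine up again even if the body of the with-block raises.
        llm.wake_up()
```

With an engine created with enable_sleep_mode=True, a training step that needs the NPU memory could then be wrapped as `with sleeping(llm): ...`. Waking up in a `finally` clause is a design choice of this sketch, not something the PR prescribes.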

How was this patch tested?

This PR was tested with Qwen/Qwen2.5-0.5B-Instruct.

At first, the free NPU memory is M1.

After llm = LLM("Qwen/Qwen2.5-0.5B-Instruct", enable_sleep_mode=True) is executed, the free NPU memory is M2, with M2 < M1.

After calling llm.sleep(level=1), the free NPU memory is M3.

We observe M3 > M2, and M3 is very close to M1.

In addition, the output tokens are the same before sleep and after wake-up, given the same input tokens and SamplingParams(temperature=0, max_tokens=10).
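
The memory observations above can be expressed as a small check. This is only a sketch: the function name and the `tolerance` threshold for "very close" are assumptions made for illustration, and how M1, M2, and M3 are measured on the NPU is left to the caller.

```python
def check_sleep_memory(m1, m2, m3, tolerance=0.05):
    """Return True if three free-memory readings follow the pattern described above.

    m1: free memory before the model is loaded
    m2: free memory after the model is loaded
    m3: free memory after sleep(level=1)
    tolerance: hypothetical bound on "very close", as a fraction of m1
    """
    loading_used_memory = m2 < m1
    sleep_freed_memory = m3 > m2
    # "M3 is very close to M1": the gap is at most tolerance * m1.
    nearly_all_returned = (m1 - m3) <= tolerance * m1
    return loading_used_memory and sleep_freed_memory and nearly_all_returned
```

All three readings must use the same unit (for example bytes); the default of 5% is arbitrary and should be tuned to the model and device.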

This PR uses the CMake procedure from #371, thanks a lot.

@Yikun Yikun mentioned this pull request Apr 1, 2025
37 tasks
@celestialli celestialli force-pushed the sleepmode branch 2 times, most recently from ca52a79 to be7bfcf Compare April 2, 2025 02:14
@celestialli celestialli force-pushed the sleepmode branch 8 times, most recently from c1fe715 to e428a62 Compare April 7, 2025 02:00
@celestialli celestialli changed the title [WIP] Add sleep mode feature for Ascend NPU Add sleep mode feature for Ascend NPU Apr 7, 2025
Signed-off-by: Shuqiao Li <[email protected]>
@wangxiyuan wangxiyuan merged commit 2b765dc into vllm-project:v0.7.3-dev Apr 7, 2025
13 checks passed
@celestialli celestialli deleted the sleepmode branch April 11, 2025 01:29
@Switchsyj

Hello, do I need to build the Ascend_C package from source in order to use this sleep() feature? It looks like this library might be required:

lib_name = find_loaded_library("vllm_ascend_C")

And I encountered this error while importing vllm_ascend.vllm_ascend_C:

ModuleNotFoundError: No module named 'vllm_ascend.vllm_ascend_C'
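
One way to check for the compiled extension without triggering the traceback is to look up its module spec. This is a minimal sketch using only the standard library; `extension_available` is a hypothetical helper name, and the check says nothing about how the extension should be built or installed.

```python
import importlib.util


def extension_available(name="vllm_ascend.vllm_ascend_C"):
    """Return True if a module called `name` can be located, else False.

    For a dotted name, find_spec imports the parent package first, so a
    missing parent raises ModuleNotFoundError; that case is reported as False.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        return False
```

If this returns False for the default name, the installed vllm_ascend package does not contain the extension module under that name.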
