Add sleep mode feature for Ascend NPU #416

celestialli · 2025-03-28T02:41:46Z

What this PR does / why we need it?

This PR adds a sleep mode feature to vllm-ascend. When the engine sleeps, it mainly does two things:

offload model weights
discard kv cache

RLHF tools (such as https://github.com/volcengine/verl and https://github.com/OpenRLHF/OpenRLHF) have a strong need for sleep mode to accelerate the training process.

This PR may resolve #375 and #320.

Does this PR introduce any user-facing change?

No existing user interfaces are changed.
Users get two new methods: sleep() and wake_up().

How was this patch tested?

This PR is tested with Qwen/Qwen2.5-0.5B-Instruct.

Initially, the free NPU memory is M1.

After llm = LLM("Qwen/Qwen2.5-0.5B-Instruct", enable_sleep_mode=True) is executed, the free NPU memory is M2, with M2 < M1.

We then call llm.sleep(level=1), after which the free NPU memory is M3.

We observe M3 > M2, and M3 is very close to M1.

In addition, the output tokens before sleep and after wake-up are identical, given SamplingParams(temperature=0, max_tokens=10) and the same input tokens.
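The check described above can be sketched as a small helper. This is a minimal illustration and not the test code shipped in this PR: the names `check_sleep_cycle`, `SleepCapable`, `free_memory`, `build_engine`, `run`, and the 5% `tolerance` are all hypothetical, chosen so the logic can be exercised without an NPU.

```python
from typing import Callable, Dict, Protocol, Sequence


class SleepCapable(Protocol):
    """Minimal interface the sketch relies on (hypothetical name)."""

    def sleep(self, level: int = 1) -> None: ...

    def wake_up(self) -> None: ...


def check_sleep_cycle(
    free_memory: Callable[[], int],
    build_engine: Callable[[], SleepCapable],
    run: Callable[[SleepCapable], Sequence[int]],
    tolerance: float = 0.05,  # assumed threshold for "very close to M1"
) -> Dict[str, bool]:
    """Reproduce the M1/M2/M3 comparison and the output-equality check."""
    m1 = free_memory()            # free memory before the model is loaded
    engine = build_engine()
    m2 = free_memory()            # free memory after the model is loaded
    before = list(run(engine))    # output tokens before sleeping

    engine.sleep(level=1)
    m3 = free_memory()            # free memory while asleep

    engine.wake_up()
    after = list(run(engine))     # output tokens after waking up

    return {
        "load_used_memory": m2 < m1,
        "sleep_freed_memory": m3 > m2,
        "close_to_baseline": (m1 - m3) <= tolerance * m1,
        "same_output": before == after,
    }
```

On an Ascend machine, one would presumably pass `build_engine=lambda: LLM("Qwen/Qwen2.5-0.5B-Instruct", enable_sleep_mode=True)`, a `run` callable that returns `llm.generate(prompts, SamplingParams(temperature=0, max_tokens=10))[0].outputs[0].token_ids`, and a `free_memory` callable such as `lambda: torch.npu.mem_get_info()[0]`. The last one assumes that `torch_npu` exposes `mem_get_info` in the same way `torch.cuda` does.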

This PR uses the CMake procedure from #371; thanks a lot.

setup.py

Signed-off-by: Shuqiao Li <[email protected]>

Switchsyj · 2025-04-11T09:25:51Z

Hello, do I need to build the Ascend_C package from source in order to use the sleep() feature? It looks like this library might be required:

lib_name = find_loaded_library("vllm_ascend_C")

I also encountered this error while importing vllm_ascend.vllm_ascend_C:

ModuleNotFoundError: No module named 'vllm_ascend.vllm_ascend_C'
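A quick way to tell whether the compiled extension is present in a given environment is to look it up without importing it. The sketch below uses only the standard library. The helper name `has_module` is hypothetical, and the check does not say how the extension should be built or installed.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located by the import system."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is itself missing.
        return False


if __name__ == "__main__":
    print(has_module("vllm_ascend.vllm_ascend_C"))
```

If this prints `False`, the installed vllm_ascend package does not contain the extension module, which matches the error above.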

github-actions bot added the module:tests label Mar 28, 2025

ganyi1996ppo reviewed Mar 28, 2025

Review comment on setup.py (outdated, resolved)

celestialli force-pushed the sleepmode branch from e9a6ca8 to 15eb913 Compare March 29, 2025 09:18

Yikun mentioned this pull request Apr 1, 2025

vLLM Ascend Roadmap Q2 2025 #448

Open

37 tasks

celestialli force-pushed the sleepmode branch 2 times, most recently from ca52a79 to be7bfcf Compare April 2, 2025 02:14

github-actions bot added the module:core label Apr 2, 2025

celestialli force-pushed the sleepmode branch 8 times, most recently from c1fe715 to e428a62 Compare April 7, 2025 02:00

celestialli changed the title ~~[WIP] Add sleep mode feature for Ascend NPU~~ Add sleep mode feature for Ascend NPU Apr 7, 2025

celestialli force-pushed the sleepmode branch from e428a62 to 2f2bac0 Compare April 7, 2025 03:03

github-actions bot removed the module:tests label Apr 7, 2025

sleep mode

5e6c45d

Signed-off-by: Shuqiao Li <[email protected]>

celestialli force-pushed the sleepmode branch from 2f2bac0 to 5e6c45d Compare April 7, 2025 03:20

wangxiyuan approved these changes Apr 7, 2025

wangxiyuan merged commit 2b765dc into vllm-project:v0.7.3-dev Apr 7, 2025
13 checks passed

celestialli deleted the sleepmode branch April 11, 2025 01:29

celestialli mentioned this pull request Apr 16, 2025

[0.7.3] Update CMakeLists.txt to adjust to more user envs #535

Open
