
Make lazy mode autodetection more robust #921

Open · wants to merge 5 commits into base: habana_main

Changes from 1 commit
Make lazy mode autodetection more robust
kzawora-intel authored Mar 17, 2025
Verified: this commit was created on GitHub.com and signed with GitHub's verified signature.
commit 28e679c068710c477c6efc914f0f87d07fed7013
20 changes: 16 additions & 4 deletions vllm/plugins/__init__.py
@@ -68,14 +68,26 @@
     # does not support torch.compile
     # Eager backend (PT_HPU_LAZY_MODE = 0) must be selected for
     # torch.compile support
-    is_lazy = os.environ.get('PT_HPU_LAZY_MODE', '1') == '1'
-    if is_lazy:
-        torch._dynamo.config.disable = True
+    _environ = dict(os.environ)
+    env_update_dict = {}
+    try:
         # NOTE(kzawora) multi-HPU inference with HPUGraphs (lazy-only)
         # requires enabling lazy collectives
         # see https://docs.habana.ai/en/latest/PyTorch/Inference_on_PyTorch/Inference_Using_HPU_Graphs.html # noqa: E501
+        # this does nothing for eager/t.compile
         os.environ['PT_HPU_ENABLE_LAZY_COLLECTIVES'] = 'true'
+
+        lazy_mode_env_var = os.environ.get('PT_HPU_LAZY_MODE', None)
+        is_lazy = lazy_mode_env_var == '1'
+        if lazy_mode_env_var is None:
+            import habana_frameworks.torch as htorch
+            is_lazy = htorch.utils.internal.is_lazy()

+        if is_lazy:
+            torch._dynamo.config.disable = True
+            env_update_dict['PT_HPU_ENABLE_LAZY_COLLECTIVES'] = 'true'


Do we really need to clean this up? I'm assuming PT_HPU_ENABLE_LAZY_COLLECTIVES only affects lazy mode and is ignored in other cases. If we skip the cleanup, we could end up with something as simple as:

os.environ['PT_HPU_ENABLE_LAZY_COLLECTIVES'] = 'true'
import habana_frameworks.torch as htorch
if htorch.utils.internal.is_lazy():
    torch._dynamo.config.disable = True
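
One tradeoff worth noting with the simpler version: skipping the cleanup leaves PT_HPU_ENABLE_LAZY_COLLECTIVES set in os.environ for the lifetime of the process even on eager/torch.compile runs, whereas the snapshot-and-restore in this diff re-applies the flag only when lazy mode is actually detected. Whether that matters presumably depends on whether anything else reads the flag after plugin loading.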

+    finally:
+        os.environ.clear()
+        os.environ.update(_environ)
+        os.environ.update(env_update_dict)
     plugins = load_plugins_by_group(group='vllm.general_plugins')
     # general plugins, we only need to execute the loaded functions
     for func in plugins.values():
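
For readers skimming the diff, here is the new flow rewritten as a standalone sketch. This is a minimal reconstruction, not the file verbatim: detect_lazy_mode is a hypothetical helper added here for readability, and the surrounding plugin-loading code is omitted.

import os

import torch


def detect_lazy_mode() -> bool:
    # Hypothetical helper mirroring the diff's logic: an explicit
    # PT_HPU_LAZY_MODE setting takes precedence; otherwise defer to
    # habana_frameworks' own detection.
    lazy_mode_env_var = os.environ.get('PT_HPU_LAZY_MODE', None)
    if lazy_mode_env_var is not None:
        return lazy_mode_env_var == '1'
    import habana_frameworks.torch as htorch
    return htorch.utils.internal.is_lazy()


_environ = dict(os.environ)  # snapshot so the try block leaves no stray state
env_update_dict = {}
try:
    # Lazy collectives are needed for multi-HPU inference with HPUGraphs
    # (lazy mode only); the flag does nothing for eager/torch.compile.
    os.environ['PT_HPU_ENABLE_LAZY_COLLECTIVES'] = 'true'
    if detect_lazy_mode():
        # The lazy backend does not support torch.compile, so disable dynamo
        # and remember to keep the lazy-collectives flag after restoring.
        torch._dynamo.config.disable = True
        env_update_dict['PT_HPU_ENABLE_LAZY_COLLECTIVES'] = 'true'
finally:
    # Restore the snapshot, then re-apply only the updates we want to keep.
    os.environ.clear()
    os.environ.update(_environ)
    os.environ.update(env_update_dict)

Running this with PT_HPU_LAZY_MODE=1 or PT_HPU_LAZY_MODE=0 forces a mode explicitly; leaving the variable unset exercises the new autodetection path.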