Multi-Accelerator torch setup and unexpected uv run behaviour #12290

Open
relativityhd opened this issue Mar 18, 2025 · 0 comments
Labels
question Asking for clarification or support

Question

Hi all,

I am wondering whether the problem below is expected behavior. I have already found a workaround, which I describe at the end.

Problem

In one of my larger projects, I stumbled upon some odd behavior. The project uses PyTorch and several other packages that depend on PyTorch. I set up uv following the PyTorch integration guide from the uv documentation (via extras). When I run uv sync --extra cpu followed by uv run ..., uv starts installing nvidia-* packages, which should only be installed for non-CPU PyTorch builds. However, if I run uv run --extra cpu ..., everything is fine. I was able to reproduce the problem in a fresh project:

The pyproject looks like this:

[project]
name = "reproduce-torch"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = ["numpy>=2.2.4", "efficientnet-pytorch>=0.7.1"]

[project.optional-dependencies]
cpu = ["torch>=2.2.0"]
cuda = ["torch>=2.2.0"]

[tool.uv]
conflicts = [[{ extra = "cpu" }, { extra = "cuda" }]]

[tool.uv.sources]
torch = [
    { index = "pytorch-cpu", extra = "cpu" },
    { index = "pytorch-cuda", extra = "cuda" },
]


[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

[[tool.uv.index]]
name = "pytorch-cuda"
url = "https://download.pytorch.org/whl/cu126"
explicit = true

Now locking and syncing with uv sync --extra cpu installs everything as expected (no nvidia-* packages):

$ uv sync --extra cpu
Using CPython 3.13.2
Creating virtual environment at: .venv
Resolved 30 packages in 2ms
Installed 12 packages in 331ms
 + efficientnet-pytorch==0.7.1
 + filelock==3.18.0
 + fsspec==2025.3.0
 + jinja2==3.1.6
 + markupsafe==3.0.2
 + mpmath==1.3.0
 + networkx==3.4.2
 + numpy==2.2.4
 + setuptools==76.1.0
 + sympy==1.13.1
 + torch==2.6.0+cpu
 + typing-extensions==4.12.2

A look at uv tree shows that efficientnet-pytorch depends on torch:

$ uv tree
reproduce-torch v0.1.0
├── efficientnet-pytorch v0.7.1
│   └── torch v2.6.0
│       ├── filelock v3.18.0
│       ├── fsspec v2025.3.0
│       ├── jinja2 v3.1.6
│       │   └── markupsafe v3.0.2
│       ├── networkx v3.4.2
│       ├── nvidia-cublas-cu12 v12.4.5.8
│       ├── nvidia-cuda-cupti-cu12 v12.4.127
│       ├── nvidia-cuda-nvrtc-cu12 v12.4.127
│       ├── nvidia-cuda-runtime-cu12 v12.4.127
│       ├── nvidia-cudnn-cu12 v9.1.0.70
│       │   └── nvidia-cublas-cu12 v12.4.5.8
│       ├── nvidia-cufft-cu12 v11.2.1.3
│       │   └── nvidia-nvjitlink-cu12 v12.4.127
│       ├── nvidia-curand-cu12 v10.3.5.147
│       ├── nvidia-cusolver-cu12 v11.6.1.9
│       │   ├── nvidia-cublas-cu12 v12.4.5.8
│       │   ├── nvidia-cusparse-cu12 v12.3.1.170
│       │   │   └── nvidia-nvjitlink-cu12 v12.4.127
│       │   └── nvidia-nvjitlink-cu12 v12.4.127
│       ├── nvidia-cusparse-cu12 v12.3.1.170 (*)
│       ├── nvidia-cusparselt-cu12 v0.6.2
│       ├── nvidia-nccl-cu12 v2.21.5
│       ├── nvidia-nvjitlink-cu12 v12.4.127
│       ├── nvidia-nvtx-cu12 v12.4.127
│       ├── setuptools v76.1.0
│       ├── sympy v1.13.1
│       │   └── mpmath v1.3.0
│       ├── triton v3.2.0
│       └── typing-extensions v4.12.2
├── numpy v2.2.4
└── torch v2.6.0+cu126 (extra: cuda)
(*) Package tree already displayed

Note that torch v2.6.0+cpu (extra: cpu) is missing from the tree, despite being installed.

Now running uv run python starts downloading and installing the nvidia packages; this can take a few minutes since they are quite large:

# Output while installing
$ uv run python
⠧ Preparing packages... (0/9)
nvidia-cuda-nvrtc-cu12 ------------------------------ 955.00 KiB/22.58 MiB
nvidia-nvjitlink-cu12 ------------------------------ 1.11 MiB/37.44 MiB
nvidia-curand-cu12 ------------------------------ 953.81 KiB/53.85 MiB
nvidia-cufft-cu12 ------------------------------ 996.71 KiB/116.00 MiB
nvidia-cusolver-cu12 ------------------------------ 952.56 KiB/118.41 MiB
nvidia-nccl-cu12 ------------------------------ 980.81 KiB/158.30 MiB
nvidia-cusparse-cu12 ------------------------------ 979.56 KiB/186.88 MiB
nvidia-cublas-cu12 ------------------------------ 1012.81 KiB/391.57 MiB
nvidia-cudnn-cu12 ------------------------------ 979.03 KiB/697.83 MiB    
# Output after installing
$ uv run python
Installed 14 packages in 32ms
Python 3.13.2 (main, Feb  5 2025, 19:11:32) [Clang 19.1.6 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

Syncing again with uv sync --extra cpu uninstalls the unwanted packages:

$ uv sync --extra cpu
Resolved 30 packages in 1ms
Uninstalled 14 packages in 26ms
 - nvidia-cublas-cu12==12.4.5.8
 - nvidia-cuda-cupti-cu12==12.4.127
 - nvidia-cuda-nvrtc-cu12==12.4.127
 - nvidia-cuda-runtime-cu12==12.4.127
 - nvidia-cudnn-cu12==9.1.0.70
 - nvidia-cufft-cu12==11.2.1.3
 - nvidia-curand-cu12==10.3.5.147
 - nvidia-cusolver-cu12==11.6.1.9
 - nvidia-cusparse-cu12==12.3.1.170
 - nvidia-cusparselt-cu12==0.6.2
 - nvidia-nccl-cu12==2.21.5
 - nvidia-nvjitlink-cu12==12.4.127
 - nvidia-nvtx-cu12==12.4.127
 - triton==3.2.0

Running uv run with the cpu extra works without installing the unwanted packages:

$ uv run --extra cpu python
Python 3.13.2 (main, Feb  5 2025, 19:11:32) [Clang 19.1.6 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 
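Independent of which command installed the environment, it can help to verify which torch build actually ended up in it. The following is a hypothetical sanity check (not part of uv) that inspects the installed torch version's local tag, e.g. "2.6.0+cpu" vs. "2.6.0+cu126":

```python
# Hypothetical helper: detect whether the installed torch is a CPU-only build
# by looking at the local version segment of its package metadata.
from importlib import metadata


def torch_is_cpu_build() -> bool:
    """Return True only if torch is installed and carries a '+cpu' tag."""
    try:
        version = metadata.version("torch")  # e.g. "2.6.0+cpu" or "2.6.0+cu126"
    except metadata.PackageNotFoundError:
        return False  # torch not installed at all
    return version.endswith("+cpu")
```

Running this right after uv run starts (e.g. in a conftest or entry point) would catch the accidental CUDA install before any large download is put to use.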

Workaround

Moving the PyTorch-dependent package efficientnet-pytorch into the [project.optional-dependencies] section alongside torch solves the problem:

...
dependencies = ["numpy>=2.2.4"]

[project.optional-dependencies]
cpu = ["torch>=2.2.0", "efficientnet-pytorch>=0.7.1"]
cuda = ["torch>=2.2.0", "efficientnet-pytorch>=0.7.1"]
...

Final thoughts

I am not sure whether this behavior is intended by uv. The workaround works, but it can be tedious for larger projects that support more than two torch variants (e.g. we try to support systems with CUDA 11.8, 12.1, 12.4, 12.6, and CPU) and/or have many torch-dependent packages.
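To illustrate the tedium, here is a sketch of what the workaround looks like with just three variants; every extra must repeat the full list of torch-dependent packages, and each non-CPU variant needs its own index entry (the cu118 index URL follows the same pattern as the cu126 one above, but verify it for your setup):

```toml
[project.optional-dependencies]
cpu   = ["torch>=2.2.0", "efficientnet-pytorch>=0.7.1"]
cu118 = ["torch>=2.2.0", "efficientnet-pytorch>=0.7.1"]
cu126 = ["torch>=2.2.0", "efficientnet-pytorch>=0.7.1"]

[tool.uv]
conflicts = [
    [{ extra = "cpu" }, { extra = "cu118" }, { extra = "cu126" }],
]

[tool.uv.sources]
torch = [
    { index = "pytorch-cpu", extra = "cpu" },
    { index = "pytorch-cu118", extra = "cu118" },
    { index = "pytorch-cu126", extra = "cu126" },
]

[[tool.uv.index]]
name = "pytorch-cu118"
url = "https://download.pytorch.org/whl/cu118"
explicit = true
```

With five variants and a handful of torch-dependent packages, every dependency bump has to be repeated in each extra.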

Also, I find it a little unintuitive that uv run does not automatically re-apply the extras from the previous sync.

I am happy about any thoughts, potentially better approaches, and other input. 😄

P.S. If anyone has a better title for this issue that makes it easier for others to find, I am open to ideas.

Platform

Linux 5.15.167.4-microsoft-standard-WSL2 x86_64 GNU/Linux

Version

uv 0.6.7
