
Failed Dispatch of Custom CUDA OP in TorchAO #1981

Closed
leslie-fang-intel opened this issue Mar 30, 2025 · 2 comments

Comments


leslie-fang-intel commented Mar 30, 2025

Install PyTorch 2.6 and the TorchAO nightly release, then run the custom ops' UT in TorchAO on an A100:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
pip install --pre torchao --index-url https://download.pytorch.org/whl/nightly/cu126

clear && python -u -m pytest -s -v test/test_ops.py -k test_dequantize_tensor_core_tiled_layout_correctness_quant_dequant

I get the following error:

 NotImplementedError: Could not run 'torchao::dequantize_tensor_core_tiled_layout' with arguments from the 'CUDA' backend. 

I hit the same error when I build TorchAO from source, where, judging from the build log, I believe csrc/cuda/tensor_core_tiled_layout/tensor_core_tiled_layout.cu has been built.
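One way to investigate a mismatch like this is to look at which compiled C extensions actually exist in the installed package. The sketch below is a hypothetical diagnostic helper (not part of TorchAO): given a package directory, it lists every `_C*.so` file with its modification time, so a stale duplicate left over from an earlier build stands out.

```python
# Hypothetical diagnostic helper (not TorchAO code): list every compiled
# _C*.so extension in a package directory, with modification times, so a
# stale duplicate that shadows the freshly built one is easy to spot.
import datetime
import pathlib


def list_cpp_extensions(pkg_dir):
    """Return sorted (filename, mtime ISO string) pairs for _C*.so files."""
    results = []
    for so in sorted(pathlib.Path(pkg_dir).glob("_C*.so")):
        mtime = datetime.datetime.fromtimestamp(so.stat().st_mtime)
        results.append((so.name, mtime.isoformat()))
    return results
```

For an installed package, `pkg_dir` could be obtained from `importlib.util.find_spec("torchao").submodule_search_locations`; a healthy install should show exactly one extension file.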

@leslie-fang-intel
Collaborator Author

It turns out there are 2 C extensions in my system, _C.cpython-310-x86_64-linux-gnu.so and _C.abi3.so, and the initialization loads the .so without the custom kernels. Feels like there might be 2 improvements:

  1. setup.py clean should remove all existing .so files
  2. When there is more than one .so, we could throw an error instead of using logging.debug
    logging.debug(f"Skipping import of cpp extensions: {e}")
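The second suggestion above could be sketched as follows. This is a hypothetical loader check, not TorchAO's actual import logic: it fails loudly when zero or multiple `_C*.so` files are present, instead of silently loading whichever one the interpreter finds first.

```python
# Hypothetical sketch of suggestion 2 (not TorchAO's actual loader):
# refuse to proceed when more than one compiled _C*.so extension exists,
# so a stale build from a previous setup cannot silently shadow the
# freshly built extension with the custom kernels.
from pathlib import Path


def load_cpp_extension(package_dir):
    """Return the single _C*.so in package_dir, or raise if 0 or >1 exist."""
    so_files = sorted(Path(package_dir).glob("_C*.so"))
    if len(so_files) > 1:
        raise RuntimeError(
            f"Found {len(so_files)} compiled extensions {so_files}; "
            "remove stale .so files and rebuild."
        )
    if not so_files:
        raise RuntimeError(f"No compiled _C*.so extension found in {package_dir}")
    return so_files[0]
```

Raising here, rather than logging at debug level, turns the silent-fallback failure mode described in this issue into an immediate, actionable error.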

@leslie-fang-intel
Collaborator Author

Closing this, as it turned out to be specific to my environment.
