
Not Implemented Error "torch.ao.quantization.quantize_dynamic" #1329

Open
gurwinderintel opened this issue Jan 29, 2025 · 0 comments
🐛 Describe the bug

NotImplementedError: The operator 'quantized::linear_dynamic' is not currently implemented for the XPU device. Please open a feature on https://github.com/intel/torch-xpu-ops/issues. You can set the environment variable PYTORCH_ENABLE_XPU_FALLBACK=1 to use the CPU implementation as a fallback for XPU unimplemented operators. WARNING: this will bring unexpected performance compared with running natively on XPU.
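The error message itself offers an escape hatch: enabling the CPU fallback for operators without an XPU kernel. Assuming a Linux shell and that the repro script is saved as `repro.py` (a hypothetical filename), it can be enabled before launching Python; note the warning that the fallback op runs at CPU speed:

```shell
# Fall back to the CPU implementation for operators with no XPU kernel
export PYTORCH_ENABLE_XPU_FALLBACK=1
python repro.py
```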

import torch
import intel_extension_for_pytorch as ipex

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.fc(x)

input_fp32 = torch.randn(4, 1, 4, 4).to("xpu")
model_fp32 = M().to("xpu")

# Dynamic quantization replaces nn.Linear with quantized::linear_dynamic,
# which currently has no XPU kernel.
model_int8 = torch.ao.quantization.quantize_dynamic(
    model_fp32,
    {torch.nn.Linear},
    dtype=torch.qint8)

result = model_int8(input_fp32)  # raises NotImplementedError
onednn_verbose,v1,info,oneDNN v3.7.0 (commit 6bd190dff524568f979ad08e4b26656cf8211d04)
onednn_verbose,v1,info,cpu,runtime:threadpool,nthr:6
onednn_verbose,v1,info,cpu,isa:Intel AVX-512 with AVX512BW, AVX512VL, and AVX512DQ extensions
onednn_verbose,v1,info,gpu,runtime:DPC++
onednn_verbose,v1,info,gpu,engine,sycl gpu device count:1 
onednn_verbose,v1,info,gpu,engine,0,backend:Level Zero,name:Intel(R) Graphics,driver_version:1.6.32524,binary_kernels:enabled
onednn_verbose,v1,info,graph,backend,0:dnnl_backend
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,implementation,backend,exec_time
onednn_verbose,v1,primitive,create:cache_miss,gpu,reorder,jit:ir,undef,src:f32::blocked:ab::f0 dst:s8::blocked:ab::f0,attr-scales:dst:0:f32,,4x4,2.78003
onednn_verbose,v1,primitive,exec,gpu,reorder,jit:ir,undef,src:f32::blocked:ab::f0 dst:s8::blocked:ab::f0,attr-scales:dst:0:f32,,4x4,76.854

return self._op(*args, **(kwargs or {}))
NotImplementedError: The operator 'quantized::linear_dynamic' is not currently implemented for the XPU device. Please open a feature on https://github.com/intel/torch-xpu-ops/issues. You can set the environment variable `PYTORCH_ENABLE_XPU_FALLBACK=1` to use the CPU implementation as a fallback for XPU unimplemented operators. WARNING: this will bring unexpected performance compared with running natively on XPU.
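Until an XPU kernel for `quantized::linear_dynamic` lands, one workaround (besides the `PYTORCH_ENABLE_XPU_FALLBACK=1` fallback the error suggests) is to keep both the model and the input on CPU for the dynamically quantized path, since the operator does have a CPU kernel. A minimal sketch of the same repro running on CPU only:

```python
import torch

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.fc(x)

# Keep everything on CPU: quantized::linear_dynamic has a CPU implementation.
model_fp32 = M()
model_int8 = torch.ao.quantization.quantize_dynamic(
    model_fp32,
    {torch.nn.Linear},
    dtype=torch.qint8)

input_fp32 = torch.randn(4, 1, 4, 4)  # CPU tensor, no .to("xpu")
result = model_int8(input_fp32)
print(result.shape)  # torch.Size([4, 1, 4, 4])
```

Only the `nn.Linear` weights are quantized to `qint8` here; activations stay `float32`, so the output is a regular float tensor.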

Versions

Versions of relevant libraries:
[pip3] intel_extension_for_pytorch==2.5.10
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] onnx==1.17.0
[pip3] onnxruntime==1.19.2
[pip3] torch==2.5.0a0
[pip3] torchaudio==2.1.0
[pip3] torchvision==0.16.0
[pip3] torchviz==0.0.2
[conda] Could not collect

@daisyden daisyden added this to the PT2.8 milestone Feb 20, 2025