
Not Implemented Error "torch.ao.quantization.quantize_dynamic" #1329

Open
gurwinderintel opened this issue Jan 29, 2025 · 0 comments
🐛 Describe the bug

NotImplementedError: The operator 'quantized::linear_dynamic' is not currently implemented for the XPU device. Please open a feature on https://github.com/intel/torch-xpu-ops/issues. You can set the environment variable PYTORCH_ENABLE_XPU_FALLBACK=1 to use the CPU implementation as a fallback for XPU unimplemented operators. WARNING: this will bring unexpected performance compared with running natively on XPU.
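The error message itself offers an escape hatch: enabling the CPU fallback for operators without an XPU kernel. Assuming a Linux shell and that the repro script is saved as `repro.py` (a hypothetical filename), it can be enabled before launching Python; note the warning that the fallback op runs at CPU speed:

```shell
# Fall back to the CPU implementation for operators with no XPU kernel
export PYTORCH_ENABLE_XPU_FALLBACK=1
python repro.py
```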

import torch
import intel_extension_for_pytorch as ipex

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.fc(x)

input_fp32 = torch.randn(4, 1, 4, 4).to("xpu")
model_fp32 = M().to("xpu")

# Dynamic quantization replaces nn.Linear with quantized::linear_dynamic,
# which currently has no XPU kernel.
model_int8 = torch.ao.quantization.quantize_dynamic(
    model_fp32,
    {torch.nn.Linear},
    dtype=torch.qint8)

result = model_int8(input_fp32)  # raises NotImplementedError
onednn_verbose,v1,info,oneDNN v3.7.0 (commit 6bd190dff524568f979ad08e4b26656cf8211d04)
onednn_verbose,v1,info,cpu,runtime:threadpool,nthr:6
onednn_verbose,v1,info,cpu,isa:Intel AVX-512 with AVX512BW, AVX512VL, and AVX512DQ extensions
onednn_verbose,v1,info,gpu,runtime:DPC++
onednn_verbose,v1,info,gpu,engine,sycl gpu device count:1 
onednn_verbose,v1,info,gpu,engine,0,backend:Level Zero,name:Intel(R) Graphics,driver_version:1.6.32524,binary_kernels:enabled
onednn_verbose,v1,info,graph,backend,0:dnnl_backend
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,implementation,backend,exec_time
onednn_verbose,v1,primitive,create:cache_miss,gpu,reorder,jit:ir,undef,src:f32::blocked:ab::f0 dst:s8::blocked:ab::f0,attr-scales:dst:0:f32,,4x4,2.78003
onednn_verbose,v1,primitive,exec,gpu,reorder,jit:ir,undef,src:f32::blocked:ab::f0 dst:s8::blocked:ab::f0,attr-scales:dst:0:f32,,4x4,76.854

return self._op(*args, **(kwargs or {}))
NotImplementedError: The operator 'quantized::linear_dynamic' is not currently implemented for the XPU device. Please open a feature on https://github.com/intel/torch-xpu-ops/issues. You can set the environment variable `PYTORCH_ENABLE_XPU_FALLBACK=1` to use the CPU implementation as a fallback for XPU unimplemented operators. WARNING: this will bring unexpected performance compared with running natively on XPU.
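Until an XPU kernel for `quantized::linear_dynamic` lands, one workaround (besides the `PYTORCH_ENABLE_XPU_FALLBACK=1` fallback the error suggests) is to keep both the model and the input on CPU for the dynamically quantized path, since the operator does have a CPU kernel. A minimal sketch of the same repro running on CPU only:

```python
import torch

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.fc(x)

# Keep everything on CPU: quantized::linear_dynamic has a CPU implementation.
model_fp32 = M()
model_int8 = torch.ao.quantization.quantize_dynamic(
    model_fp32,
    {torch.nn.Linear},
    dtype=torch.qint8)

input_fp32 = torch.randn(4, 1, 4, 4)  # CPU tensor, no .to("xpu")
result = model_int8(input_fp32)
print(result.shape)  # torch.Size([4, 1, 4, 4])
```

Only the `nn.Linear` weights are quantized to `qint8` here; activations stay `float32`, so the output is a regular float tensor.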

Versions

Versions of relevant libraries:
[pip3] intel_extension_for_pytorch==2.5.10
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] onnx==1.17.0
[pip3] onnxruntime==1.19.2
[pip3] torch==2.5.0a0
[pip3] torchaudio==2.1.0
[pip3] torchvision==0.16.0
[pip3] torchviz==0.0.2
[conda] Could not collect

@daisyden daisyden added this to the PT2.8 milestone Feb 20, 2025