Int4-quantized models use the MatMulNBits contrib operator to perform the quantized matmul. This op does not support dynamic weight shapes, since K and N are baked into the operator's attributes.
I believe there has been some discussion of standardizing MatMulNBits into the ONNX spec with dynamic-shape support, but unfortunately that is not available for now.
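To see this limitation concretely, here is a minimal sketch that lists the MatMulNBits nodes in an int4-quantized graph along with the attributes that fix the weight shapes (the "model.onnx" path is a placeholder, not from this thread):

```python
# Minimal sketch: inspect an int4-quantized ONNX graph and print the
# K/N attributes baked into each MatMulNBits node. "model.onnx" is a
# placeholder path for your quantized model file.
import onnx

model = onnx.load("model.onnx", load_external_data=False)
for node in model.graph.node:
    if node.op_type == "MatMulNBits":
        # K, N, bits, and block_size are integer attributes of the
        # com.microsoft MatMulNBits contrib op, so the weight shape
        # is fixed at export time rather than at runtime.
        attrs = {a.name: a.i for a in node.attribute
                 if a.name in ("K", "N", "bits", "block_size")}
        print(node.name or node.op_type, attrs)
```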
It seems that dynamic rank is not supported when quantizing the model.
The command line is as follows:
olive auto-opt --model_name_or_path microsoft/Phi-3.5-mini-instruct --trust_remote_code --adapter_path path_adapter --output_path path_output --device cpu --provider CPUExecutionProvider --precision int4 --use_ort_genai --log_level 1
Can you kindly let me know if this will be implemented in the future or if there is a work-around?
Thanks very much!