
Dynamic rank not supported when quantizing the model #1584

Open

miaoqiz opened this issue Jan 28, 2025 · 1 comment

miaoqiz commented Jan 28, 2025

It seems that dynamic rank is not supported when quantizing the model.

The command line is as follows:


olive auto-opt --model_name_or_path microsoft/Phi-3.5-mini-instruct --trust_remote_code --adapter_path path_adapter --output_path path_output --device cpu --provider CPUExecutionProvider --precision int4 --use_ort_genai --log_level 1


Can you kindly let me know if this will be supported in the future, or if there is a workaround?

Thanks very much!

jambayk (Contributor) commented Jan 30, 2025

int4 quantized models use the MatMulNBits contrib operator to perform the quantized matmul. This op does not support dynamic weight shapes, since K and N are baked into the operator's attributes.

I think there have been some discussions about standardizing MatMulNBits into the ONNX spec with dynamic shape support, but unfortunately it is not supported for now.
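
To see what this means concretely, here is a minimal sketch that inspects those baked-in attributes. It assumes you have the int4-quantized ONNX model on disk; `model_int4.onnx` is a placeholder path, not a file Olive produces under that name.

```python
import onnx
from onnx import helper

# Placeholder path: point this at your int4-quantized ONNX model.
model = onnx.load("model_int4.onnx")

for node in model.graph.node:
    if node.op_type == "MatMulNBits":
        # K (input feature dim) and N (output feature dim) are stored as
        # integer attributes on the node itself, so the weight shape is
        # fixed at export time and cannot vary at runtime.
        attrs = {a.name: helper.get_attribute_value(a) for a in node.attribute}
        print(node.name, "K =", attrs.get("K"), "N =", attrs.get("N"))
```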
