Int4-quantized models use the MatMulNBits contrib operator to perform the quantized matmul. This op does not support dynamic weight shapes, since K and N are baked into the operator's attributes.
I believe there has been some discussion of standardizing MatMulNBits into the ONNX spec with dynamic-shape support, but unfortunately that is not available for now.
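To see this limitation concretely, here is a minimal sketch that lists the MatMulNBits nodes in an int4-quantized graph along with the attributes that fix the weight shapes (the "model.onnx" path is a placeholder, not from this thread):

```python
# Minimal sketch: inspect an int4-quantized ONNX graph and print the
# K/N attributes baked into each MatMulNBits node. "model.onnx" is a
# placeholder path for your quantized model file.
import onnx

model = onnx.load("model.onnx", load_external_data=False)
for node in model.graph.node:
    if node.op_type == "MatMulNBits":
        # K, N, bits, and block_size are integer attributes of the
        # com.microsoft MatMulNBits contrib op, so the weight shape
        # is fixed at export time rather than at runtime.
        attrs = {a.name: a.i for a in node.attribute
                 if a.name in ("K", "N", "bits", "block_size")}
        print(node.name or node.op_type, attrs)
```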
It seems that dynamic rank is not supported when quantizing the model.
The command line is as follows:
olive auto-opt --model_name_or_path microsoft/Phi-3.5-mini-instruct --trust_remote_code --adapter_path path_adapter --output_path path_output --device cpu --provider CPUExecutionProvider --precision int4 --use_ort_genai --log_level 1
Can you kindly let me know if this will be implemented in the future or if there is a work-around?
Thanks very much!