LoRA node has new "quant" substring in onnx graph after quantization #1583

miaoqiz · 2025-01-28T01:56:47Z

It seems that when quantization is enabled in the "auto-opt" command, "quant" is added to the names of LoRA-related nodes in the graph.

The command line is as follows:

olive auto-opt --model_name_or_path microsoft/Phi-3.5-mini-instruct --trust_remote_code --adapter_path path_adapter --output_path path_output --device cpu --provider CPUExecutionProvider --precision int4 --use_ort_genai --log_level 1

Is there a way to restore the original LoRA node names, without the "quant" substring?

Thanks very much!

jambayk · 2025-01-30T04:23:32Z

The .quant part was added because each quantized adapter weight has 2 or 3 parameters (quantized weight, scale, and zero point), and I wanted to distinguish them from non-quantized weights.

Olive/olive/passes/onnx/extract_adapters.py

Line 106 in d98186d

quantized_suffices = [".quant.weight", ".quant.scale", ".quant.zero_point"]
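If downstream code only needs to relate the quantized names back to a common adapter weight name, one option is to strip the suffix on the consumer side instead of renaming anything in the graph. The following is a minimal sketch under that assumption, not part of Olive: `split_quantized_name` is a hypothetical helper, and the example names are made up for illustration. Only the three suffixes come from the line quoted above.

```python
# Suffixes quoted from extract_adapters.py (see the line above).
QUANTIZED_SUFFIXES = (".quant.weight", ".quant.scale", ".quant.zero_point")


def split_quantized_name(name):
    """Split an adapter weight name into (base_name, kind).

    kind is "weight", "scale", or "zero_point" for quantized entries,
    and None when the name carries no quantized suffix.
    Hypothetical helper for illustration; it is not an Olive API.
    """
    for suffix in QUANTIZED_SUFFIXES:
        if name.endswith(suffix):
            # Drop the whole suffix to get the shared base name,
            # and keep the part after ".quant." as the parameter kind.
            return name[: -len(suffix)], suffix[len(".quant."):]
    return name, None


# Example names are assumptions, not names taken from a real model.
for example in ("layer0.lora_A.quant.scale", "layer0.lora_A.weight"):
    print(example, "->", split_quantized_name(example))
```

This only inspects names. Actually renaming the weights would presumably have to be done consistently in both the ONNX graph and the exported adapter file, which this sketch does not attempt.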

Do you have a use case that depends on the adapter weight names?
