The speedup ratio of Activation-aware Reordering #53

GCQi · 2025-02-05T11:55:34Z

Activation-aware reordering is an ingenious idea, but I couldn't find its speedup in Fig. 16 of your paper. I'm wondering about its performance. Could you show me the results or tell me how to test it?

Thanks !

ys-2020 · 2025-02-26T21:57:36Z

Hi, thank you for your interest in QServe. The purpose of activation-aware reordering is to reduce the quantization error of the model's weights. The reordering is performed offline during the model quantization stage, so it does not affect the inference speed of the model.
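
To illustrate why an offline channel permutation adds no arithmetic at inference time, here is a minimal sketch in plain Python. It is not QServe's code: the helper names (`salience_order`, `permute_columns`, `matvec`), the toy 2x4 weight matrix, and the salience values are all hypothetical, and the sketch assumes the same permutation is applied consistently to the activations. It only checks that a layer's output is unchanged when its input channels are reordered; it does not measure quantization error.

```python
import math

def salience_order(salience):
    """Return channel indices sorted from most to least salient."""
    return sorted(range(len(salience)), key=lambda c: salience[c], reverse=True)

def permute_columns(weight, perm):
    """Reorder the input channels (columns) of a row-major weight matrix."""
    return [[row[c] for c in perm] for row in weight]

def matvec(weight, x):
    """Plain matrix-vector product: one dot product per output row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

# Hypothetical toy layer and per-channel activation salience (assumed values).
weight = [[0.5, -1.0, 2.0, 0.25],
          [1.5, 0.75, -0.5, 1.0]]
salience = [0.1, 8.0, 0.3, 6.0]
x = [1.0, 2.0, 3.0, 4.0]

# Offline step: compute the permutation once and reorder the weights.
perm = salience_order(salience)
weight_reordered = permute_columns(weight, perm)

# The activations must follow the same channel order.
x_reordered = [x[c] for c in perm]

y_ref = matvec(weight, x)
y_new = matvec(weight_reordered, x_reordered)

# Same number of multiply-adds, same result.
print(perm, y_ref, y_new)
```

After reordering, channels with similar salience sit next to each other, so a group-wise quantizer that splits the input dimension into contiguous groups would place them in the same group. The matrix product itself has the same shape and cost as before.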
