Activation-aware Reorder is an ingenious idea, but I didn't find a corresponding speedup in Fig. 16 of your paper. I'm wondering about the performance of Activation-aware Reorder. Could you show me the performance, or tell me how to test it?
Thanks!
Hi, thank you for your interest in QServe. The purpose of activation-aware reordering is to reduce the quantization error of the model's weights. The reordering is performed offline, during the model quantization stage, so it does not affect the inference speed of the model.
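To illustrate why the reordering is free at inference time, here is a minimal NumPy sketch. The salience criterion (mean activation magnitude), group size, and tensor shapes are all illustrative assumptions, not QServe's actual implementation; the point is that permuting weight columns only changes which channels share a quantization group, while the matmul result is unchanged when activations are permuted consistently.

```python
import numpy as np

def group_quantize(w, group_size=4, bits=4):
    """Symmetric per-group fake-quantization along input channels."""
    qmax = 2 ** (bits - 1) - 1
    out = np.empty_like(w)
    for g in range(0, w.shape[1], group_size):
        blk = w[:, g:g + group_size]
        scale = max(np.abs(blk).max() / qmax, 1e-8)
        out[:, g:g + group_size] = np.round(blk / scale) * scale
    return out

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16))               # weight matrix [out_features, in_features]
calib = np.abs(rng.normal(size=(64, 16)))  # hypothetical calibration activations
salience = calib.mean(axis=0)              # per-input-channel activation magnitude

order = np.argsort(-salience)              # illustrative salience-based channel order
w_reordered = w[:, order]                  # offline: permute weight columns

wq = group_quantize(w_reordered)           # quantize with the new group layout

# At inference the activations use the same fixed permutation, so the
# permutation itself is mathematically free (no extra runtime cost):
x = rng.normal(size=(4, 16))
print(np.allclose(x @ w.T, x[:, order] @ w_reordered.T))  # → True
```

Since `order` is fixed after calibration, the permutation can be folded into the preceding layer's output layout, which is why the figure shows no speed difference with or without reordering.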