
Why Activation-aware Reorder works? #55

Open
siahuat0727 opened this issue Feb 7, 2025 · 1 comment

Comments

@siahuat0727

Hi, I'm trying to understand Activation-aware Reorder. I know it groups channels with similar salience, but I'm unsure why this makes them easier to quantize. Could you elaborate on the benefit of this grouping? Thank you.

@ys-2020
Contributor

ys-2020 commented Feb 26, 2025

Hi. We empirically found that grouping channels with similar salience results in better quantization accuracy. We hypothesize that this is related to the weight-activation scaling process. Specifically, we put the weights corresponding to larger activations into the same group. Before quantization, we apply scaling between weights and activations, which may cause the weights within a group to have similar magnitudes. That is why it may help reduce the quantization error.
