
Why Activation-aware Reorder works? #55

Open
siahuat0727 opened this issue Feb 7, 2025 · 1 comment

Comments

@siahuat0727

Hi, I'm trying to understand Activation-aware Reorder. I know it groups channels with similar salience, but I'm unsure why this makes them easier to quantize. Could you elaborate on the benefit of this grouping? Thank you.

@ys-2020
Contributor

ys-2020 commented Feb 26, 2025

Hi. We empirically found that grouping channels with similar salience results in better quantization accuracy. We hypothesize that this is related to the weight-activation scaling process. Specifically, we put the weights corresponding to larger activations into the same group. Before quantization, we apply scaling between weights and activations, which may cause the weights within a group to have similar magnitudes. That is why it may help reduce the quantization error.
