[Misc] Add transpose optimization in the linear layer #280

Open · wants to merge 1 commit into v0.7.3-dev

Conversation

@rjg-lyh commented Mar 9, 2025

What this PR does / why we need it?

To improve performance, this PR hoists the internal transpose out of the Linear layer's forward path: when the default non-quantized method is used, the layer's weights are transposed once after the model weights are loaded, so forward inference can use the pre-transposed weights directly instead of transposing them on every call.
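
As an illustration of the idea (a minimal sketch, not the code in this PR's diff), the one-time re-layout can be done in a post-load hook so the non-quantized forward path multiplies against the pre-transposed weight directly. The class name `TransposedLinearMethod` is hypothetical; `process_weights_after_loading` and `apply` only mirror the vLLM linear-method convention.

```python
# Sketch only: transpose the Linear weight once after loading so the
# non-quantized forward can matmul without a per-call transpose.
from typing import Optional

import torch


class TransposedLinearMethod:  # hypothetical name, for illustration
    def process_weights_after_loading(self, layer: torch.nn.Module) -> None:
        # nn.Linear-style weights are stored as (out_features, in_features);
        # replace them with a contiguous (in_features, out_features) copy.
        layer.weight = torch.nn.Parameter(
            layer.weight.data.t().contiguous(), requires_grad=False)

    def apply(self, layer: torch.nn.Module, x: torch.Tensor,
              bias: Optional[torch.Tensor] = None) -> torch.Tensor:
        # x: (num_tokens, in_features) @ weight: (in_features, out_features)
        y = torch.matmul(x, layer.weight)
        return y if bias is None else y + bias
```

The sketch only shows the one-time weight re-layout; the gain described in the PR comes from performing that transpose once at load time rather than during every forward call.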

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Comprehensive unit tests have been performed in another PR.
