
Does torchao support FP8 Grouped GEMM? #1928

Open
zigzagcai opened this issue Mar 20, 2025 · 5 comments

@zigzagcai
Grouped GEMM kernels (https://github.com/fanshiqing/grouped_gemm) are used in many MoE models.

I just wonder whether torchao supports FP8 kernels for Grouped GEMM, such as the three commonly used ops:

grouped_gemm.backend.gmm
grouped_gemm.ops.unpermute
grouped_gemm.ops.permute
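
For context, a rough sketch of how these three ops typically compose in an MoE token-dispatch path. The shapes, the top-1 routing, and the exact signatures (the row_id_map returned by permute, the CPU batch_sizes expected by gmm) are assumptions based on the library's README and may differ across versions:

import torch
import grouped_gemm  # https://github.com/fanshiqing/grouped_gemm

# Hypothetical MoE dispatch: 1024 tokens, hidden size 4096, 8 experts, top-1 routing.
tokens = torch.randn(1024, 4096, device="cuda", dtype=torch.bfloat16)
expert_idx = torch.randint(0, 8, (1024, 1), device="cuda", dtype=torch.int32)

# Group tokens so rows routed to the same expert are contiguous.
# (Assumed return convention: permuted tokens plus a map to undo the permutation.)
permuted, row_id_map = grouped_gemm.ops.permute(tokens, expert_idx)

# Per-expert weights and per-expert row counts (batch_sizes on CPU, which is
# what the library's gmm expects, if I recall its interface correctly).
w = torch.randn(8, 4096, 4096, device="cuda", dtype=torch.bfloat16)
batch_sizes = torch.bincount(expert_idx.flatten(), minlength=8).cpu()

# One grouped GEMM over all experts instead of a Python loop of eight matmuls
# (grouped_gemm.ops.gmm is the autograd-aware wrapper around backend.gmm).
out = grouped_gemm.ops.gmm(permuted, w, batch_sizes, trans_b=False)

# Scatter expert outputs back to the original token order.
restored = grouped_gemm.ops.unpermute(out, row_id_map)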
@vkuzo
Contributor

vkuzo commented Mar 20, 2025

hi @zigzagcai, we recently landed a grouped gemm API in core which includes fp8: pytorch/pytorch#148531. We plan to provide wrappers in torchao, although we do not have them just yet. cc @drisspg

@zigzagcai
Author

Thank you @vkuzo!
I just wonder how I can use these newly added aten grouped gemm ops?

@supriyar
Contributor

cc @HDCharles who has been looking into MoE quantization and grouped gemm recently

@HDCharles
Contributor

Hey,

I'm working on enabling our existing quantization kernels to compose with grouped gemm; it's still in progress at the moment. As for the core kernel, you can look at: https://github.com/pytorch/pytorch/pull/148531/files#diff-3f31c52b48cfddf8f4617d809f7695b2e4a1c78656f8c4b5143a4b45d01fcf23R1178

...for an example
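
For anyone who'd rather not dig through the PR, a minimal sketch of what calling the new core op might look like, assuming the private op torch._grouped_mm from that PR keeps roughly this shape (2D activations, 3D stacked expert weights, cumulative int32 offsets along the token dimension). The fp8 path reportedly goes through torch._scaled_grouped_mm with explicit scales; the test file linked above is the authoritative reference for both, and the exact arguments may differ:

import torch

# Hypothetical sizes: 64 tokens split across 3 experts (16, 32, 16 rows),
# hidden size 128, output size 256.
a = torch.randn(64, 128, device="cuda", dtype=torch.bfloat16)
b = torch.randn(3, 128, 256, device="cuda", dtype=torch.bfloat16)

# Cumulative group boundaries along dim 0 of `a` (assumed int32 convention,
# last entry equal to the total number of rows).
offs = torch.tensor([16, 48, 64], device="cuda", dtype=torch.int32)

# Each contiguous group of rows in `a` is multiplied by its own expert
# weight matrix b[i], producing a (64, 256) output in one kernel call.
out = torch._grouped_mm(a, b, offs=offs)

# The fp8 variant is expected to be torch._scaled_grouped_mm(a_fp8, b_fp8,
# scale_a, scale_b, offs=..., out_dtype=...); the required scale shapes and
# memory layout for `b` are best taken from the PR's test file.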

@jeromeku
Contributor

@HDCharles @vkuzo

Interested in this as well, and potentially in helping tune the kernel.

There is a link mentioned in the grouped gemm PR describing the design of the grouped GEMM. How can I view the doc (access seems to be gated)?

drisspg added the float8 label Mar 24, 2025