Add Triton MLA Decode Rope Kernel #232

Open: 1 commit to merge into main

lucas-santos-amd (Contributor) commented Mar 20, 2025

  • Moved the _fwd_kernel_stage2_asm Triton Kernel to aiter/mla.py.
  • Renamed decode_mla.py to mla_decode_ref.py and moved it to op_tests/triton/utils. For now, it serves as the reference for the unit tests of both the ASM and the Triton MLA decode rope implementations.
  • Added the Triton MLA Decode Rope kernel and the stage2 kernels to the mla_decode_rope.py file.
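
The description does not show what the stage2 kernels compute. As a rough illustration only, and assuming they follow the common split-KV decode pattern (stage 1 produces a partial attention output and a log-sum-exp per KV split, stage 2 merges them), the reduction can be sketched in plain Python. The function names `attention`, `merge_splits`, and `split_attention` are hypothetical and do not come from this PR or from aiter.

```python
import math


def attention(q, keys, values, scale):
    """Plain softmax attention for one query vector (single head).

    Returns the output vector and the log-sum-exp of the scaled scores.
    """
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) / total
           for d in range(dim)]
    return out, m + math.log(total)


def merge_splits(partials):
    """Stage-2 style reduction (assumed pattern, not the PR's kernel).

    Each partial is (output, lse) for one KV split. Every split is
    reweighted by exp(lse - global_max) and the result is renormalized.
    """
    global_max = max(lse for _, lse in partials)
    dim = len(partials[0][0])
    acc = [0.0] * dim
    denom = 0.0
    for out, lse in partials:
        w = math.exp(lse - global_max)
        denom += w
        for d in range(dim):
            acc[d] += w * out[d]
    return [a / denom for a in acc]


def split_attention(q, keys, values, scale, num_splits):
    """Run attention per KV chunk (stage 1), then merge (stage 2)."""
    chunk = math.ceil(len(keys) / num_splits)
    partials = [
        attention(q, keys[i:i + chunk], values[i:i + chunk], scale)
        for i in range(0, len(keys), chunk)
    ]
    return merge_splits(partials)
```

A unit test in this style would compare `split_attention` against `attention` on the same inputs within a small tolerance, which is the same role the description assigns to mla_decode_ref.py for the real kernels.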

lucas-santos-amd force-pushed the lusantos/mla_decode_rope_triton branch from ad03939 to e65a7b4 on March 21, 2025 13:14