Commit 3bff50d

Need better way to organize the estimations.
1 parent: e40d046

1 file changed, 1 insertion(+), 1 deletion(-)

lib/nnc/mfa/v2/AttentionDescriptor.cpp

@@ -454,7 +454,7 @@ std::vector<AttentionParameterRow> AttentionDescriptor::forwardMixed(MTL::Device
 if (device->supportsFamily(MTL::GPUFamily(1009))) {
 return {
 AttentionParameterRow(32, 16, 128, 16, { AttentionOperand::Q, AttentionOperand::O }),
-AttentionParameterRow(96, 16, 128, 32, { AttentionOperand::Q, AttentionOperand::O }),
+AttentionParameterRow(64, 16, 128, 32, { AttentionOperand::Q, AttentionOperand::O }),
 AttentionParameterRow(160, 32, 128, 32, { AttentionOperand::O }),
 AttentionParameterRow(224, 32, 128, 32, { AttentionOperand::Q }),
 AttentionParameterRow(384, 32, 128, 32, {})
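
To make the effect of the one-number change easier to reason about, here is a minimal sketch of how a table like this might be consulted. None of it comes from the commit itself. `ParameterRowSketch`, its field names, `forwardMixedSketch`, and `selectRow` are hypothetical, and the selection rule (take the first row whose first value is at least the head dimension) is an assumption about what the first constructor argument means.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for AttentionParameterRow. The field names are
// guesses at the meaning of the five constructor arguments in the diff.
struct ParameterRowSketch {
  uint16_t maximumHeadDimension;  // assumed: largest head dimension the row covers
  uint16_t parallelization;       // assumed: block size, parallelization dimension
  uint16_t traversal;             // assumed: block size, traversal dimension
  uint16_t head;                  // assumed: block size, head dimension
  unsigned cachedOperandCount;    // simplified: number of cached operands
};

// The table as it reads after this commit (second row changed from 96 to 64).
inline std::vector<ParameterRowSketch> forwardMixedSketch() {
  return {
    {32, 16, 128, 16, 2},
    {64, 16, 128, 32, 2},
    {160, 32, 128, 32, 1},
    {224, 32, 128, 32, 1},
    {384, 32, 128, 32, 0},
  };
}

// Assumed selection rule: return the first row whose bound covers the
// requested head dimension, falling back to the last row.
inline ParameterRowSketch selectRow(const std::vector<ParameterRowSketch>& table,
                                    uint16_t headDimension) {
  assert(!table.empty());
  for (const ParameterRowSketch& row : table) {
    if (headDimension <= row.maximumHeadDimension) {
      return row;
    }
  }
  return table.back();
}
```

If that rule holds, a head dimension of 80 would have matched the old 96 row (second value 16). After this commit it falls through to the 160 row (second value 32, with only `O` cached).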
