Pascal support #223
-
The description mentions flash attention, which is not supported on Pascal. Does anyone know if flash attention is optional, such that it is still possible to run on Pascal-class GPUs (e.g. P40, P100), or whether it is required, meaning these older cards are not supported?
Answered by michaelfeil, May 16, 2024
-
Hey @cduk, it uses torch's functional `F.scaled_dot_product_attention` (SDPA). This falls back to FlashAttention-2, memory-efficient attention, or plain MHA depending on your hardware (CPU, CUDA compute capability / cache). Pascal is therefore supported.
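For context, all of SDPA's backends compute the same result, softmax(QKᵀ/√d)V; only the kernel differs. A minimal NumPy sketch of that math (the plain-attention path that torch falls back to on hardware without FlashAttention support, such as Pascal) might look like:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Reference (math) implementation of SDPA: softmax(q @ k^T / sqrt(d)) @ v.
    This mirrors what F.scaled_dot_product_attention computes when it cannot
    dispatch to a FlashAttention-2 or memory-efficient kernel."""
    d = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d)
    # numerically stable softmax over the key dimension
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# toy example: batch=1, heads=2, seq_len=4, head_dim=8
rng = np.random.default_rng(0)
q = rng.standard_normal((1, 2, 4, 8))
k = rng.standard_normal((1, 2, 4, 8))
v = rng.standard_normal((1, 2, 4, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (1, 2, 4, 8)
```

In torch itself, the backend choice happens inside `F.scaled_dot_product_attention`; you can also constrain which kernels it may use via `torch.nn.attention.sdpa_kernel` if you want to verify which path your GPU actually takes.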
Answer selected by
cduk