bindless_images/sampling_3D.cpp and bindless_images/sampling_2D.cpp tests failing with UR_RESULT_ERROR_UNSUPPORTED_FEATURE on HIP/AMD #16933
Comments
This pre-commit job is still using an unsupported ROCm GPU: gfx1031. I suggest that the pre-commit HIP runner using gfx1031 simply be removed if it is not possible to run it on an officially supported GPU. |
https://github.com/intel/llvm/actions/runs/13252932583/job/36996542695?pr=16954 has
|
If all the mentioned tests still fail (sporadically) on the supported GPU, then the problem was obviously not the unsupported GPU on the other runner. So if we have no evidence that the behavior on the supported GPU differs from that on the unsupported GPU, we should re-enable the runner IMO. |
Thanks, this is a legit failure; we can XFAIL this for gfx1030. It is an officially supported card, but it is an old one, so this unsupported-feature error isn't a surprise, I think. |
How is this flaky if the HW is old and the feature is unsupported? |
This is the case for this test; however, there are masses of other tests that fail on gfx1031 (unsupported) and pass on gfx1030. It is true that gfx1030 (RDNA2 architecture) is not an ideal card to test on, since it is very low on AMD's support priority for bug fixes, and we are already aware that it has more failures than e.g. the CDNA series cards (or, I imagine, RDNA3). But it is the only officially supported card available, and it is much more reliable than unsupported cards. |
I didn't know this, that solves my concern. Thanks. |
@JackAKirk, could you please give more information on why gfx1031 (newer HW) is not supported? Is this due to some SW issues in the ROCm drivers? Do we have plans to support this family of AMD GPUs in the future? DPC++ users should be able to find this information in the product documentation, right? It seems like we don't have any diagnostics in our product or in our testing environment. It might be useful to have some checks in lit.cfg.py or the DPC++ runtime for unsupported platforms to give users meaningful diagnostics. It's hard for DPC++ developers to identify whether a test failure is a real product issue or an environmental issue. |
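A minimal sketch of the kind of check suggested above, assuming the test harness can obtain a device description string that contains the gfx architecture name. The function names and the supported set are hypothetical and are not existing lit.cfg.py code; the set lists only gfx1030 because that is the one architecture this thread names as officially supported, and a real list would have to come from the ROCm support matrix.

```python
import re

# Assumption for illustration: only gfx1030 is listed, because it is the one
# architecture named as officially supported in this thread. A real list
# would be taken from the ROCm support matrix for the platform in use.
OFFICIALLY_SUPPORTED_ARCHS = {"gfx1030"}


def extract_gfx_arch(device_description):
    """Return the first gfx architecture token in the string, or None."""
    match = re.search(r"gfx[0-9a-f]+", device_description)
    return match.group(0) if match else None


def diagnose_amd_arch(device_description, supported=OFFICIALLY_SUPPORTED_ARCHS):
    """Return a warning string for an unsupported architecture.

    Returns None when the architecture is in the supported set or when no
    gfx token is present in the description.
    """
    arch = extract_gfx_arch(device_description)
    if arch is None or arch in supported:
        return None
    return (
        f"warning: {arch} is not in the officially supported list "
        f"({', '.join(sorted(supported))}); test failures on this device "
        "may be environmental rather than product issues"
    )


if __name__ == "__main__":
    # Hypothetical device descriptions, as a harness might report them.
    for description in ("AMD GPU gfx1030", "AMD GPU gfx1031"):
        message = diagnose_amd_arch(description)
        print(description, "->", message or "ok")
```

A harness could print the returned message once at startup, so that a failure log from an unsupported runner carries the hint with it.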
For a full list of AMD GPUs supported by the ROCm drivers (and therefore by the HIP backend of DPC++), you can refer to the link that I referred to in e.g. this comment (maybe the surrounding conversation is useful): #7634 (comment) (note that the support matrix depends on the platform, Linux or Windows).
I think that the plugin documentation refers users to the ROCm information and informs them about supported AMD devices, @npmiller?
Indeed, this is a massive headache for AMD developers. A short survey of the issues on the https://github.com/ROCm/ROCm/issues board will give you a flavour of this. |
I've marked gfx1030 unsupported for the 3D sampling failure here: #16971 |
e.g. see ROCm/HIP#3368 |
This fixes the gfx1030 3D sampling failure mentioned here: #16933 by marking this device unsupported in the test --------- Signed-off-by: JackAKirk <[email protected]>
Describe the bug
These tests are failing on unrelated changes, see:
https://github.com/intel/llvm/actions/runs/13208597080/job/36880810856?pr=16882
https://github.com/intel/llvm/actions/runs/13209448038/job/36880861128?pr=16932