-
Notifications
You must be signed in to change notification settings - Fork 765
[SYCL] Stop emission of modf intrinsic for NVPTX/AMDGCN #17958
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: sycl
Are you sure you want to change the base?
Conversation
A recent change in #126750 changed the lowering of the modf builtin(s) to lower to the llvm.modf intrinsic. Certain SYCL targets such as NVPTX and AMDGCN cannot currently handle this intrinsic in the backend because they don't have the necessary target library info. Even if they did, the SYCL device libraries are linked in earlier at the IR level so we'd be left with an unresolved symbol. We have been working around this at clang codegen time by avoiding the emission of intrinsics altogether by (ab)using a guard on math intrinsics. The problem with us hooking into GenerateIntrinsics like this is that SYCL is wanting to use it as a guarantee that (a certain subset of) intrinsics won't be emitted if GenerateIntrinsics is false, whereas upstream does not make this guarantee: it's an opt-in toggle for a more optimal lowering strategy. Thus we're always going to be liable to this sort of upstream change. This doesn't feel like the right mechanism for SYCL to handle these builtins (or intrinsics) long term, but this fix restores the previous behaviour without making things much worse. Fixes intel#17813
And still Matt Arsenault has approved this change. I suppose that change is safe for the clang (i.e. OpenCL/HIP/CUDA targeting AMDGCN). If I get it right, this the trend in community to lower all math functions builtins into LLVM instrinsics. Long term we should align DPC++ with the community approach. |
Yep I'll admit I don't fully know what's going on and what the intention and trend is. But we should try and get a handle on where things are heading and, yes, align with that approach. I'd like to remove this workaround so we can at least align AMDGCN/NVPTX paths with the SPIR-V and other SYCL targets. If we are left with codegen intrinsics we could perhaps have an IR-level pass which expands intrinsics (with/without specific flags or ULP requirements) to libdevice/libclc function calls. I'm assume we are ruling out binary-level linking. |
Tracker #17813 is quite coarse. It says Indeed with this patch, all the same tests are failing with the same problem, but only for the SPIR-V backend. I was hoping to keep the tracker in place and retitle/reword it to reflect that fact that it's still failing for the SPIR-V backend, rather than closing it and opening another almost identical one. Is that okay? |
I'm not following . The original change broke NVPTX and AMDGCN, why SPIR-V backend is failing? |
The SPIR-V backend can't handle the I've been made vaguely aware of changes in the closed source repo that align this |
A recent change in #126750 changed the lowering of the modf builtin(s) to lower to the llvm.modf intrinsic. Certain SYCL targets such as NVPTX and AMDGCN cannot currently handle this intrinsic in the backend because they don't have the necessary target library info. Even if they did, the SYCL device libraries are linked in earlier at the IR level so we'd be left with an unresolved symbol.
We have been working around this at clang codegen time by avoiding the emission of intrinsics altogether by (ab)using a guard on math intrinsics.
The problem with us hooking into GenerateIntrinsics like this is that SYCL is wanting to use it as a guarantee that (a certain subset of) intrinsics won't be emitted if GenerateIntrinsics is false, whereas upstream does not make this guarantee: it's an opt-in toggle for a more optimal lowering strategy. Thus we're always going to be liable to this sort of upstream change.
This doesn't feel like the right mechanism for SYCL to handle these builtins (or intrinsics) long term, but this fix restores the previous behaviour without making things much worse.
Fixes the AMDGCN/NVPTX aspects of #17813