This repository was archived by the owner on May 23, 2024. It is now read-only.
I was trying to test a parallel_reduce (sum) using one of the native simd types and hit a segmentation fault that seems to be caused by wrong memory alignment of the value returned by HostThreadTeamData::pool_reduce_local().
To illustrate this, I've updated avx.hpp to provide operator += (used in the reduce join operation), and used a custom reducer provided below.
A parallel_reduce with this reducer works fine if the device is Serial, but gives me a segmentation fault when I use the OpenMP device (whatever the number of threads).
If I change the simd type to, e.g., simd_abi::pack<4>, the crash disappears and it works fine.
When compiling for AVX, simd<float,simd::simd_abi::native> is 32 bytes. However, when I print, in the reducer's init, the address of the reference value coming from the call to pool_reduce_local() (in HostThreadTeamData), the address is only 16-byte aligned, whereas I think it should be 32-byte aligned. I think this explains the segmentation fault.
I may be wrong, but I think it is necessary to control alignment inside HostThreadTeamData so that the returned pointer is properly aligned.
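To make the suspected problem concrete, here is a minimal standalone sketch of the alignment check and of rounding a pointer up to the required alignment. It does not use Kokkos at all: `Vec8f`, `ScratchPool`, `is_aligned` and `align_up` are hypothetical names made up for illustration, with `Vec8f` standing in for the 32-byte AVX simd type and `ScratchPool` imitating a scratch buffer that hands out an address that is only 16-byte aligned. If the compiler emits aligned AVX loads/stores for such a type, accessing it at an address like that can fault.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for simd<float, simd_abi::native> on AVX:
// eight floats, 32 bytes in size, 32-byte alignment requirement.
struct alignas(32) Vec8f {
  float lane[8];
};

static_assert(sizeof(Vec8f) == 32, "Vec8f is expected to be 32 bytes");
static_assert(alignof(Vec8f) == 32, "Vec8f is expected to need 32-byte alignment");

// Returns true when p satisfies the given alignment (must be a power of two).
inline bool is_aligned(const void* p, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Rounds p up to the next address satisfying the alignment (power of two).
// The caller must make sure the buffer has enough room after rounding.
inline void* align_up(void* p, std::size_t alignment) {
  const auto addr = reinterpret_cast<std::uintptr_t>(p);
  const auto mask = static_cast<std::uintptr_t>(alignment) - 1;
  return reinterpret_cast<void*>((addr + mask) & ~mask);
}

// Hypothetical scratch buffer (not the Kokkos implementation). The base is
// 32-byte aligned, so base + 16 is 16-byte aligned but not 32-byte aligned,
// which mimics the address I observe in the reducer's init.
struct ScratchPool {
  alignas(32) unsigned char bytes[128];
  void* misaligned_slot() { return bytes + 16; }
};
```

With these helpers, `is_aligned(ptr, alignof(Vec8f))` is the check that fails for the observed address, and `align_up(ptr, alignof(Vec8f))` shows the kind of adjustment I would expect somewhere before the pointer is handed to the reducer.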