Error with Scan and temp_array #49
Comments
TLDR: Either use More in detail: The feature of temporary arrays is that they're transient. So by default several of them can be packed in one physical allocation, and if one is used for storage, the rest will be overwritten. The allocator decides on whether to store two arrays together or not based on their dependencies (that is, arrays they shouldn't intersect with). These are supposed to be declared by the user if a temporary array is created as Now the Now the problem here is that the manual usage was not a priority, so the whole system is a bit unfinished.
As it happens, computations currently don't do that (this won't be too hard to implement, I just completely forgot about it). So your On a side note, you can use
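The packing behaviour described above can be illustrated with a toy, CPU-only model. This is only a sketch under stated assumptions: `ToyTempAllocator` and its `dependencies` argument are hypothetical names made up for illustration, and the code is not reikna's allocator. It shows why an output stored in a temporary array can be clobbered when another temporary array, with no declared dependency on it, is packed into the same physical buffer.

```python
import numpy as np


class ToyTempAllocator:
    """Hypothetical toy model of a packing temp-array allocator (not reikna's code)."""

    def __init__(self):
        self.buffers = []  # physical allocations (flat float64 arrays)
        self.owners = []   # per buffer: set of temp-array ids packed into it
        self.next_id = 0

    def temp_array(self, size, dependencies=()):
        """Return (id, view). Arrays listed in `dependencies` never share a buffer."""
        new_id = self.next_id
        self.next_id += 1
        dep_ids = set(dependencies)
        # Reuse the first buffer that is large enough and holds no declared dependency.
        for buf, ids in zip(self.buffers, self.owners):
            if buf.size >= size and not (ids & dep_ids):
                ids.add(new_id)
                return new_id, buf[:size]
        # Otherwise create a fresh physical allocation.
        self.buffers.append(np.zeros(size))
        self.owners.append({new_id})
        return new_id, self.buffers[-1][:size]


# Case 1: no dependency declared, so both arrays are packed together.
alloc = ToyTempAllocator()
out_id, out = alloc.temp_array(4)   # stands in for the user's output array
_, scratch = alloc.temp_array(4)    # stands in for a computation's internal scratch
out[:] = [1, 2, 3, 4]
scratch[:] = 0                      # writing scratch overwrites `out`
shared_result = out.copy()

# Case 2: the scratch array declares `out` as a dependency and gets its own buffer.
alloc2 = ToyTempAllocator()
out_id2, out2 = alloc2.temp_array(4)
_, scratch2 = alloc2.temp_array(4, dependencies=[out_id2])
out2[:] = [1, 2, 3, 4]
scratch2[:] = 0                     # `out2` is left intact
safe_result = out2.copy()

print(shared_result, safe_result)
```

In the first case the stored values are lost because the two views alias the same memory; in the second the declared dependency forces separate allocations.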
Understood, thank you for the explanation.
No problem. I'm going to leave this open to remind me that computations should expose their temporary arrays.
Hello, I have an issue using a temporary array with a Scan. Here is my test code, which compares the result with numpy:
import numpy as np
import reikna.cluda as cluda
from reikna.algorithms import Transpose, Scan, predicate_sum

api = cluda.ocl_api()
platforms = api.get_platforms()
# Find an AMD or NVIDIA platform: on my laptop this avoids the Intel GPU
for platform in platforms:
    if 'AMD' in str(platform) or 'NVIDIA' in str(platform):
        break
thread = api.Thread(platform.get_devices()[0])

# np.float64 is the same dtype as the np.float alias, which newer NumPy releases removed
data = np.random.rand(500, 500, 50).astype(np.float64) * 2 * np.pi
axis = 2
data_gpu = thread.to_device(data)
temp = thread.temp_array(data.shape, np.float64)
array = thread.array(data.shape, np.float64)

program = Scan(data_gpu, predicate_sum(np.float64), axes=[axis])
cumsum = program.compile(thread)
cumsum(temp, data_gpu)   # output stored in a temporary array
cumsum(array, data_gpu)  # output stored in a regular array

res_temp = temp.get()
res_array = array.get()
res_numpy = np.cumsum(data, axis=axis)  # CPU reference result
print('Temp array error', np.linalg.norm(np.abs(res_temp) - np.abs(res_numpy)))
print('Array error', np.linalg.norm(np.abs(res_array) - np.abs(res_numpy)))
I do not get the same result in both cases. I am using a Scan in a larger program where I want to store the result of the Scan in a temporary array.
Can you help me use a temporary array with a Scan? Thanks