
Error with Scan and temp_array #49

Open · jejmule opened this issue Nov 22, 2018 · 3 comments

jejmule commented Nov 22, 2018

Hello, I have an issue using a temporary array with a Scan. Here is my test code, which compares the result with numpy:

```python
import numpy as np
import reikna.cluda as cluda
from reikna.algorithms import Transpose, Scan, predicate_sum

api = cluda.ocl_api()
platforms = api.get_platforms()
# find an AMD or NVIDIA platform: I have a laptop, and I use this to avoid the Intel GPU
for platform in platforms:
    if 'AMD' in str(platform) or 'NVIDIA' in str(platform):
        break

thread = api.Thread(platform.get_devices()[0])

data = np.random.rand(500, 500, 50).astype(np.float64) * 2 * np.pi
axis = 2

data_gpu = thread.to_device(data)
temp = thread.temp_array(data.shape, np.float64)
array = thread.array(data.shape, np.float64)

program = Scan(data_gpu, predicate_sum(np.float64), axes=[axis])
cumsum = program.compile(thread)
cumsum(temp, data_gpu)
cumsum(array, data_gpu)

res_temp = temp.get()
res_array = array.get()
res_numpy = np.cumsum(data, axis=axis)

print('Temp array error', np.linalg.norm(np.abs(res_temp) - np.abs(res_numpy)))
print('Array error', np.linalg.norm(np.abs(res_array) - np.abs(res_numpy)))
```

I do not get the same result in both cases. I am using a Scan in a larger program where I want to store the result of the Scan in a temporary array.

Can you help me use a temporary array with a Scan? Thanks

fjarri (Owner) commented Nov 22, 2018

TLDR:

Either use thread.array() for temp as well, or wrap the whole sequence of computation calls and temporary allocations into one Computation.

In more detail:

The defining feature of temporary arrays is that they are transient. By default, several of them can be packed into one physical allocation, and if one is used for storage, the others may be overwritten. The allocator decides whether to store two arrays in the same allocation based on their dependencies (that is, the arrays they must not intersect with). These are supposed to be declared by the user when a temporary array is created via Thread.temp_array(); there is a dependencies keyword parameter for that.
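As a toy illustration of why undeclared dependencies are dangerous (plain numpy standing in for reikna's allocator; buf, temp_a and temp_b are made-up names, not reikna API):

```python
import numpy as np

# Toy illustration only: two "temporary arrays" packed into overlapping
# regions of one physical allocation, because no dependency between
# them was declared.
buf = np.zeros(12, dtype=np.float64)   # one physical allocation
temp_a = buf[0:8]                      # temporary array A
temp_b = buf[4:12]                     # temporary array B, overlapping A

temp_a[:] = 1.0   # store a result in A
temp_b[:] = 2.0   # a later computation writes to B...

print(temp_a)     # ...and the overlapping part of A's contents is gone
```

Had the allocator known that A and B must not intersect, it would have placed them in disjoint regions (or separate allocations), and the write to B would have left A intact.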

The temp_array() method of ComputationPlan, on the other hand, does not take dependencies. Since the whole plan is built before any actual allocation happens, ComputationPlan knows where each temporary array is going to be used and sets the correct dependencies for you. If you are using Thread.temp_array() directly, you have to take care of that yourself.

The problem here is that manual usage was not a priority, so this part of the system is a bit unfinished. Thread.temp_array() can discover dependencies recursively if a class exposes a __tempalloc__ attribute (see TemporaryManager.array() for details). So technically, the compiled Scan object could expose its temporary arrays there, and you could write something like

```python
temp = thread.temp_array(data.shape, np.float64, dependencies=[cumsum])
```

As it happens, computations currently don't do that (this won't be too hard to implement, I just completely forgot about it). So your temp can be overwritten after a call to any computation that uses temporary arrays (and Scan does).
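A toy sketch of what that recursive discovery could look like (the names collect_dependencies and FakeCompiledScan are invented for illustration; the real logic lives in reikna's TemporaryManager.array()):

```python
def collect_dependencies(objs):
    """Flatten objects into concrete dependencies, recursing into
    anything that exposes a __tempalloc__ attribute (e.g. a compiled
    computation exposing its internal temporary arrays)."""
    deps = []
    for obj in objs:
        tempalloc = getattr(obj, '__tempalloc__', None)
        if tempalloc is not None:
            deps.extend(collect_dependencies(tempalloc))
        else:
            deps.append(obj)
    return deps

class FakeCompiledScan:
    # Pretend a compiled computation advertises its temporaries this way.
    def __init__(self, temporaries):
        self.__tempalloc__ = temporaries

scan = FakeCompiledScan(['scan_temp_1', 'scan_temp_2'])
print(collect_dependencies([scan, 'user_array']))
# ['scan_temp_1', 'scan_temp_2', 'user_array']
```

With this in place, passing the compiled computation itself in dependencies would be enough for the allocator to keep the user's temporary array out of the computation's internal scratch space.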

On a side note, you can use the device_filters keyword parameter of Thread.create() to avoid the platform loop:

```python
thread = api.Thread.create(device_filters=dict(include_platforms=['AMD', 'NVIDIA']))
```

jejmule (Author) commented Nov 22, 2018

Understood, thank you for the explanation

fjarri (Owner) commented Nov 23, 2018

No problem. I'm going to leave this open to remind me that computations should expose their temporary arrays.
