CrsMatrixMultiplyOp should not store a local device matrix. It should store only the RCP to the CrsMatrix and obtain the local device matrix inside each apply call. Storing the local device matrix has a side effect: in non-UVM CUDA builds, the host/device use counts of the wrapped dual view will be off for the column indices whenever a user holds a CrsMatrixMultiplyOp object in the same scope as the corresponding local host matrix object.
Thanks,
Yaro
I think one issue is that the KokkosSparse::CrsMatrix requires a view and not a dual view like Tpetra stores. How would you suggest to alleviate that constraint?
I was thinking of something like the following in apply(...), assuming it has no performance impact:

    auto localMultiply =
        local_matrix_op_t(std::make_shared<local_matrix_device_type>(
            matrix_->getLocalMatrixDevice()));
This way, it would be fine to keep CrsMatrixMultiplyOp as a member without worrying about conflicts with the hidden dual views, since their scope is now limited to the apply(...) call.
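To illustrate the difference in lifetime between the two approaches, here is a minimal, self-contained sketch. It does not use Tpetra or Kokkos: `Matrix`, `DeviceView`, `StoringOp`, and `ScopedOp` are hypothetical names, and `std::shared_ptr` is only an assumed stand-in for the reference counting of a view. It does not model the dual view sync logic, only how long the extra reference stays alive.

```cpp
#include <cassert>
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reference-counted device view; not a Kokkos type.
using DeviceView = std::shared_ptr<std::vector<int>>;

// Hypothetical stand-in for the matrix that owns the column indices.
struct Matrix {
  DeviceView colIdx =
      std::make_shared<std::vector<int>>(std::vector<int>{0, 1, 2});
  // Each call hands out another reference-counted handle to the same data.
  DeviceView getLocalDevice() const { return colIdx; }
};

// Pattern described in the issue: the handle is stored as a member, so the
// use count stays elevated for the operator's whole lifetime.
struct StoringOp {
  DeviceView local;
  explicit StoringOp(const std::shared_ptr<const Matrix>& m)
      : local(m->getLocalDevice()) {}
  int apply() const { return std::accumulate(local->begin(), local->end(), 0); }
};

// Proposed pattern: store only the matrix pointer and take the handle
// inside apply(), so the extra reference is released when apply() returns.
struct ScopedOp {
  std::shared_ptr<const Matrix> matrix;
  explicit ScopedOp(std::shared_ptr<const Matrix> m) : matrix(std::move(m)) {}
  int apply() const {
    assert(matrix);
    DeviceView local = matrix->getLocalDevice();  // lives only in this scope
    return std::accumulate(local->begin(), local->end(), 0);
  }
};
```

In this sketch, constructing a `StoringOp` raises the use count of `colIdx` until the operator is destroyed, while a `ScopedOp` leaves it unchanged outside of `apply()`. Whether the per-call acquisition is cheap enough in the real code is the open performance question raised above.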