[SYCL] spgemm #550

Draft
wants to merge 70 commits into
base: master

abagusetty
Contributor

  • Updates spgemm functionality for SYCL
  • Most of the code stays the same as the CUDA version for easier maintenance
  • Depends on the infrastructure from [SYCL] Seq mv sycl #538
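
For readers unfamiliar with the operation, spgemm is the sparse matrix-matrix product C = A * B. The sketch below is a plain host-side reference in C++ using a row-wise (Gustavson-style) accumulation over CSR data. It is not the code in this PR and not hypre's API: the `Csr` struct and `spgemm_reference` function are hypothetical names chosen for illustration, and the device versions discussed here are organized differently. Something like this could serve as a small correctness check against a device result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal CSR container for illustration only
// (this is NOT hypre's hypre_CSRMatrix).
struct Csr {
    int nrows;
    int ncols;
    std::vector<int> rowptr;   // size nrows + 1
    std::vector<int> colind;   // size nnz
    std::vector<double> val;   // size nnz
};

// Row-wise (Gustavson-style) sparse product C = A * B on the host.
// Column indices within each row of C appear in first-touch order,
// not necessarily sorted.
Csr spgemm_reference(const Csr& A, const Csr& B) {
    assert(A.ncols == B.nrows);

    Csr C;
    C.nrows = A.nrows;
    C.ncols = B.ncols;
    C.rowptr.assign(A.nrows + 1, 0);

    std::vector<double> acc(B.ncols, 0.0);  // dense accumulator for one row of C
    std::vector<int> marker(B.ncols, -1);   // marker[j] == i: column j already seen in row i
    std::vector<int> cols;                  // columns touched in the current row

    for (int i = 0; i < A.nrows; ++i) {
        cols.clear();
        // For each nonzero A(i,k), add A(i,k) * B(k,:) into the accumulator.
        for (int p = A.rowptr[i]; p < A.rowptr[i + 1]; ++p) {
            const int k = A.colind[p];
            const double a = A.val[p];
            for (int q = B.rowptr[k]; q < B.rowptr[k + 1]; ++q) {
                const int j = B.colind[q];
                if (marker[j] != i) {       // first contribution to C(i,j)
                    marker[j] = i;
                    acc[j] = 0.0;
                    cols.push_back(j);
                }
                acc[j] += a * B.val[q];
            }
        }
        // Compress the accumulator into CSR storage for row i.
        for (int j : cols) {
            C.colind.push_back(j);
            C.val.push_back(acc[j]);
        }
        C.rowptr[i + 1] = static_cast<int>(C.colind.size());
    }
    return C;
}
```

The dense accumulator keeps the sketch short; it costs O(ncols) extra memory per call, which is one reason GPU implementations typically use hashing or a separate symbolic pass instead.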

Wayne Mitchell and others added 30 commits July 24, 2021 00:40
This is an initial commit that still needs some reworking
and debugging.
This does a bunch of name changing of files, data structures,
and variables from 'cuda' to 'device' in order to reflect which
things are generic device functionality vs. tied to a specific
language. In addition, this now compiles and runs a simple program
that calls HYPRE_Init() and allocates/copies/frees memory on the
device and host with unified memory.
Quick fix for compilation --with-cuda. Ran some tests on lassen
and quartz as well to make sure I didn't break the cuda or cpu
versions.
Modified csr matvec to choose the default execution policy
instead of hard-coded device policy. This now passes tests
and seems to run as expected using sycl unified memory and
using host execution for everything.
Starting to put in boxloop sycl code. This compiles,
but crashes.
I have fixed my compilation issues and can now run with my
sycl boxloop1 implementation on frank's server machine. The
boxloop1 code seems to be giving correct results as well,
though it seems somewhere along the line I screwed up the
struct solvers tests, which yield a discrepancy in number of
iterations for the first solvers.jobs job.
The non-reduction boxloops are all in and pass the struct tests.
Performance is VERY slow, but this may just be due to the machine
I am running on. Reduction boxloops are in progress.
The reduction boxloops are implemented and pass the
struct solvers.sh tests. Cleanup of boxloop_sycl.h.
Uses shared memory pointer instead of buffers and accessors.
Seems to work on iris, same error as before on arcticus.
…m() to align with hypre-space#549

2. CMakeLists support for SYCL build
3. remove dpct:: functions and include the implementations
@abagusetty abagusetty marked this pull request as draft February 15, 2022 00:44

3 participants