First pass at swm_fortran #26

Merged
merged 7 commits into from
Feb 25, 2025

Conversation

mnlevy1981
Contributor

Based on swm_c, which is funny because that was based on an old Fortran code that isn't in this library.

I still need to add write_to_file() and compare the resulting binaries to the C output, but the stdout looks good (diagonals match the C output). I'll take this out of draft when it's actually ready for consideration.

This means dswap() is a copy rather than shuffling pointers around
By default this is not called, but it's useful to have for debugging
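
The dswap() note above contrasts two ways of exchanging time levels. A minimal C sketch of that difference, assuming flat double arrays; the names `swap_by_copy` and `swap_by_pointer` are hypothetical and are not taken from swm_fortran or swm_c:

```c
#include <stddef.h>

/* Copy-based swap: exchanges the contents element by element, O(n) work.
 * Every existing pointer to a or b keeps referring to the same storage. */
void swap_by_copy(double *a, double *b, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        double tmp = a[i];
        a[i] = b[i];
        b[i] = tmp;
    }
}

/* Pointer-based swap: exchanges which buffer each name refers to, O(1) work.
 * The data itself never moves. */
void swap_by_pointer(double **a, double **b) {
    double *tmp = *a;
    *a = *b;
    *b = tmp;
}
```

The copy variant costs a full pass over the field on every call, which is the trade-off the commit message accepts in exchange for not shuffling pointers.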
@mnlevy1981
Contributor Author

I ran both the C and Fortran versions for a single time step and compared the differences:

Largest diff from p.bin: 0.0; rel err 0.0
Largest diff from u.bin: 1.63179985721712e-08; rel err 6.649217412385394e-08
Largest diff from v.bin: 1.63179985721712e-08; rel err 6.649217412385393e-08
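
For reference, a comparison like the one above could be computed along these lines. This is only a sketch: the comment does not say how "rel err" was defined, so the version here assumes it is the largest absolute difference divided by the magnitude of the reference value at that point. `compare_fields` and `diff_stats` are hypothetical names.

```c
#include <stddef.h>

typedef struct {
    double max_abs;  /* largest |a[i] - b[i]| */
    double rel;      /* that difference divided by |a[i]| (assumed definition) */
} diff_stats;

/* Absolute value without pulling in the math library. */
static double abs_d(double x) { return x < 0.0 ? -x : x; }

/* Compare a reference field a against a test field b of the same length. */
diff_stats compare_fields(const double *a, const double *b, size_t n) {
    diff_stats s = {0.0, 0.0};
    for (size_t i = 0; i < n; ++i) {
        double d = abs_d(a[i] - b[i]);
        if (d > s.max_abs) {
            s.max_abs = d;
            s.rel = (a[i] != 0.0) ? d / abs_d(a[i]) : 0.0;
        }
    }
    return s;
}
```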

I was expecting differences at the level of double-precision round-off here. I wonder if some of the C constants are being evaluated in single precision? For Fortran I'm compiling with -fdefault-real-8.
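
To illustrate why single-precision constants would be a plausible suspect, here is a small C sketch (not taken from swm_c) that measures the relative error introduced when a constant is rounded to `float` before being used in double-precision arithmetic. The constant 0.1 and the function name are arbitrary choices for illustration.

```c
/* Relative error introduced by rounding a constant to single precision. */
double single_precision_rel_err(void) {
    double c_double = 0.1;   /* double-precision literal */
    float  f = 0.1f;         /* same constant stored as a float */
    double c_single = f;     /* widened back to double; the rounding error stays */
    double diff = c_single - c_double;
    if (diff < 0.0) diff = -diff;
    return diff / c_double;
}
```

The result is on the order of 1e-8, the same magnitude as the relative errors reported for u.bin and v.bin, whereas double-precision round-off would be closer to 1e-16. That is consistent with the hypothesis but does not confirm it.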

Instead of using pointers and allocate() statements, I explicitly define targets
for u, v, and p at 3 different time levels and then point to those targets
Note that swm_fortran_amrex_driver is much slower than swm_fortran_driver, presumably
because we are passing large arrays to the kernels and then only changing a
single value in the arrays
the amrex_driver passes a full 3D array as an argument (inside a loop), while
the swm_fortran_driver passes a 2D array (and the loop is inside the kernel).
I think it was fortls giving some lint-like warnings...
mnlevy1981 marked this pull request as ready for review on February 25, 2025 at 17:16
@mnlevy1981
Contributor Author

swm_fortran/swm_fortran_kernels.F90 (called from swm_fortran/swm_fortran_driver.F90) is where I would start adding OpenACC or OpenMP directives. In that module, the do loops are inside each of the kernels.
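
As a rough picture of what adding a directive to a loops-inside-the-kernel routine looks like, here is a C sketch using OpenMP. The kernel `scale_field` is hypothetical and is not one of the kernels in this PR; in the Fortran module the analogous OpenMP directive would be `!$omp parallel do` on the outer do loop.

```c
#include <stddef.h>

/* Hypothetical kernel: the loops live inside the kernel, so a single
 * directive covers the whole sweep. If OpenMP is not enabled at compile
 * time, the pragma is ignored and the loop runs serially. */
void scale_field(double *field, size_t nx, size_t ny, double factor) {
    #pragma omp parallel for
    for (size_t i = 0; i < nx; ++i) {
        for (size_t j = 0; j < ny; ++j) {
            field[i * ny + j] *= factor;
        }
    }
}
```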

swm_fortran/swm_fortran_amrex_kernels.F90 (called from swm_fortran/swm_fortran_amrex_driver.F90) mimics what @hctorres did in swm_mini_app_kernels.h: the do loops are in the driver, but pointers to 3D arrays are passed to the kernel. When built with gfortran, this is significantly slower than including the loops in the kernels.
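
The two structures described in this comment can be sketched side by side. This is a C illustration with hypothetical names (`update_all`, `update_point`, `drive_points`) and a 2D field for brevity; it shows only the call pattern and does not reproduce the gfortran slowdown.

```c
#include <stddef.h>

/* Variant A: loops inside the kernel; one call updates the whole field. */
void update_all(double *p, size_t nx, size_t ny, double dt) {
    for (size_t i = 0; i < nx; ++i)
        for (size_t j = 0; j < ny; ++j)
            p[i * ny + j] += dt;
}

/* Variant B: point kernel; it receives the whole array but changes one element. */
void update_point(double *p, size_t ny, size_t i, size_t j, double dt) {
    p[i * ny + j] += dt;
}

/* Driver for variant B: the loops live here, with one kernel call per grid point. */
void drive_points(double *p, size_t nx, size_t ny, double dt) {
    for (size_t i = 0; i < nx; ++i)
        for (size_t j = 0; j < ny; ++j)
            update_point(p, ny, i, j, dt);
}
```

Both variants produce the same field; the difference is that variant B makes nx*ny kernel calls where variant A makes one.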

johnmauff merged commit 15f6032 into NCAR:main on Feb 25, 2025
1 check passed