Introduce 'PragmaModelTransformation' and preliminary OpenMP offload #485

Open
wants to merge 1 commit into
base: nams-scc-seq-revector

MichaelSt98
Collaborator

Introduce 'PragmaModelTransformation' and preliminary OpenMP offload

With this change, all transformations insert "generic" Loki pragmas, which the 'PragmaModelTransformation' then maps, as one of the last steps, to a specific pragma model (e.g., OpenACC or OpenMP offload).

github-actions bot commented Feb 6, 2025

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/485/index.html

codecov bot commented Feb 6, 2025

Codecov Report

Attention: Patch coverage is 95.06173% with 24 lines in your changes missing coverage. Please review.

Project coverage is 96.12%. Comparing base (f3591ec) to head (84ce65b).
Report is 3 commits behind head on nams-scc-seq-revector.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| loki/transformations/pragma_model.py | 92.82% | 16 Missing ⚠️ |
| loki/transformations/single_column/scc_cuf.py | 50.00% | 4 Missing ⚠️ |
| loki/ir/pragma_utils.py | 88.88% | 2 Missing ⚠️ |
| loki/transformations/single_column/annotate.py | 94.73% | 2 Missing ⚠️ |

Additional details and impacted files
@@                    Coverage Diff                    @@
##           nams-scc-seq-revector     #485      +/-   ##
=========================================================
- Coverage                  96.13%   96.12%   -0.02%     
=========================================================
  Files                        224      226       +2     
  Lines                      40582    40991     +409     
=========================================================
+ Hits                       39015    39404     +389     
- Misses                      1567     1587      +20     
| Flag | Coverage Δ |
| --- | --- |
| lint_rules | 96.39% <ø> (ø) |
| loki | 96.12% <95.06%> (-0.02%) ⬇️ |

@MichaelSt98
Collaborator Author

MichaelSt98 commented Feb 7, 2025

OpenMP offload incomplete ...

| Loki | OpenACC | OMP-GPU |
| --- | --- | --- |
| `create device(...)` | `declare create(...)` | |
| `update device(...) host(...)` | `update device(...) self(...)` | |
| `unscoped-data in(...) create(...)` | `enter data copyin(...) create(...)` | `target enter data map(to: ...) map(alloc: ...)` |
| `end unscoped-data out(...) delete(...)` | `exit data copyout(...) delete(...)` | `target exit data map(from: ...) map(delete: ...) map(release: ... ???)` |
| `scoped-data inout(...) in(...) out(...) create(...)` | `data copy(...) copyin(...) copyout(...) create(...)` | `target data map(tofrom: ...) map(to: ...) map(from: ...)` |
| `end scoped-data inout(...) in(...) out(...)` | `end data` | `end target data` |
| `loop gang private(...) vlength(...)` | `parallel loop gang private(...) vector_length(...)` | `target teams distribute thread_limit(...)` ??? |
| `end loop gang` | `end parallel loop` | `end target teams distribute` |
| `loop vector private(...) reduction(...)` | `loop vector private(...) reduction(...)` | `parallel do` |
| `end loop vector` | - | `end parallel do` |
| `loop seq` | `loop seq` | - |
| `end loop seq` | - | - |
| `routine vector` | `routine vector` | not allowed/supported? |
| `routine seq` | `routine seq` | `declare target` |
| `data device-present vars(...)` | `data present()` | ? |
| `device-present vars (...)` | `data present(...)` | ? |
| `end device-present vars(...)` | `end data` | ? |
| `device-ptr vars (...)` | `data deviceptr(...)` | ? |
| `end device-ptr vars(...)` | `end data` | |
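
To make the table above concrete, here is a minimal sketch of how a few of its rows could be written down as lookup tables. This is not the mapper from this PR: `DIRECTIVE_TABLE`, `CLAUSE_TABLE`, `KEYWORDS` and `translate` are hypothetical names, pragmas are handled as plain strings rather than Loki IR nodes, and only the data rows without open question marks are covered.

```python
import re

# Hypothetical lookup tables for a few rows of the table above.
# Keys are Loki commands, values are the directive per backend.
DIRECTIVE_TABLE = {
    'unscoped-data': {'openacc': 'enter data', 'omp-gpu': 'target enter data'},
    'end unscoped-data': {'openacc': 'exit data', 'omp-gpu': 'target exit data'},
    'scoped-data': {'openacc': 'data', 'omp-gpu': 'target data'},
    'end scoped-data': {'openacc': 'end data', 'omp-gpu': 'end target data'},
}

# Clause translation per backend; '{}' is replaced by the variable list.
CLAUSE_TABLE = {
    'openacc': {'in': 'copyin({})', 'out': 'copyout({})', 'inout': 'copy({})',
                'create': 'create({})', 'delete': 'delete({})'},
    'omp-gpu': {'in': 'map(to: {})', 'out': 'map(from: {})', 'inout': 'map(tofrom: {})',
                'create': 'map(alloc: {})', 'delete': 'map(delete: {})'},
}

KEYWORDS = {'openacc': 'acc', 'omp-gpu': 'omp'}


def translate(content, backend):
    """Translate the content of a ``!$loki`` pragma into a backend pragma string."""
    for command, directives in DIRECTIVE_TABLE.items():
        if content == command or content.startswith(command + ' '):
            break
    else:
        raise ValueError(f'No mapping for Loki pragma: {content!r}')
    parts = [directives[backend]]
    # In the table, the end of a scoped data region carries no clauses
    if command != 'end scoped-data':
        # Clauses have the form name(var1, var2)
        for name, args in re.findall(r'(\w+)\(([^)]*)\)', content[len(command):]):
            parts.append(CLAUSE_TABLE[backend][name].format(args))
    return f'!${KEYWORDS[backend]} ' + ' '.join(parts)
```

With these tables, `translate('unscoped-data in(tmp1, tmp2) create(tmp3, tmp4)', 'openacc')` gives `!$acc enter data copyin(tmp1, tmp2) create(tmp3, tmp4)`, and the same content with `'omp-gpu'` gives `!$omp target enter data map(to: tmp1, tmp2) map(alloc: tmp3, tmp4)`.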

@MichaelSt98 MichaelSt98 marked this pull request as draft February 10, 2025 08:54
@MichaelSt98 MichaelSt98 force-pushed the nams-programming-model branch from 643f400 to 84ce65b Compare February 13, 2025 10:01
@MichaelSt98 MichaelSt98 marked this pull request as ready for review February 13, 2025 11:11
Collaborator

@reuterbal reuterbal left a comment

Many thanks, this is a really cool development!
Conceptually this is fantastic and I don't have anything to change there. But a few other things could be improved, which I've mostly marked with inline comments.

A few more general remarks here:

I tried to do this during the review but it would be good if you could also double-check where the switch to loki directives has implications on the transformation beyond updating the keyword and naming of directives. For example:

  • The SCCAnnotate transformation shouldn't care about the directive type anymore, and anything related to this should be possible to remove
  • In a few places there used to be a direct translation from !$loki ... to !$acc ..., which now sometimes seems redundant. I've tried to flag this, but it's likely I overlooked it in other situations.

I know that even after this PR the Loki directives are not set in stone and we can update them as required. But it's likely a good idea to sanity check them on this occasion. I've already flagged some things in inline comments, but particularly for structured directives that consist of !$loki command and !$loki end command pairs, I don't see any reason why the end statement should include any additional arguments, as it currently does for end scoped-data.

Other than that, the usual nagging about some small things, docs and tests ;-)

Comment on lines +145 to +162
pragma_parameters = PragmaParameters()
parameters = defaultdict(list)
if pragma.keyword.lower() != 'loki':
    return None, None
content = pragma.content or ''
# Remove any line-continuation markers
content = content.replace('&', '')
starts_with = content.split(' ')[0]
if starts_with == 'end':
    starts_with = f"{starts_with}-{content.split(' ')[1]}"
if not starts_with:
    return None, None
content = content[len(starts_with):]
parameter = pragma_parameters.find(content)
for key in parameter:
    parameters[key].append(parameter[key])
parameters = {k: v if len(v) > 1 else v[0] for k, v in parameters.items()}
return starts_with, parameters

Collaborator

This largely replicates the other get_pragma_parameters utility, except for the explicit awareness of the command and the matching end key. Can we maybe consolidate the two, or re-use common code?

(Also, the naming of the utility isn't quite right, I think)

An untested idea in this direction:

def get_pragma_command_and_parameters(pragma, only_loki_pragmas=True):
    # get_pragma_parameters returns a dict, so take its (keyword, value) items
    pragma_parameters = list(get_pragma_parameters(pragma, only_loki_pragmas=only_loki_pragmas).items())
    if not pragma_parameters:
        return None, None
    if pragma_parameters[0][0] == 'end':
        if len(pragma_parameters) < 2 or pragma_parameters[1][1] is not None:
            debug(f'get_pragma_command_and_parameters: Failed to match end-command in pragma {pragma}')
            return None, None
        pragma_parameters = [
            (f'{pragma_parameters[0][0]}-{pragma_parameters[1][0]}', None),
            *pragma_parameters[2:]
        ]
    return pragma_parameters[0][0], dict(pragma_parameters[1:])

Collaborator

And while we're at it, can we please have a test for this utility?
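
A test along these lines might look like the sketch below. To keep it runnable without Loki, it uses a hypothetical stand-in parser, `split_command_and_parameters`, that works on the plain content string and mimics the behaviour discussed above (first word as command, `end` joined with the following word). A real test would build `Pragma` nodes and call the actual utility instead.

```python
import re

# One token is a keyword, optionally followed by a parenthesised argument list
_TOKEN = re.compile(r'([\w-]+)\s*(?:\(([^)]*)\))?')


def split_command_and_parameters(content):
    """Hypothetical stand-in: split pragma content into (command, parameters)."""
    # Treat line-continuation markers as whitespace
    tokens = _TOKEN.findall(content.replace('&', ' '))
    if not tokens:
        return None, None
    (command, _), *rest = tokens
    if command == 'end':
        # 'end' must be followed by a bare command name without arguments
        if not rest or rest[0][1]:
            return None, None
        command = f'end-{rest[0][0]}'
        rest = rest[1:]
    # Keywords without arguments map to None
    return command, {name: args or None for name, args in rest}


def test_split_command_and_parameters():
    assert split_command_and_parameters('unscoped-data in(tmp1, tmp2) create(tmp3, tmp4)') == (
        'unscoped-data', {'in': 'tmp1, tmp2', 'create': 'tmp3, tmp4'}
    )
    assert split_command_and_parameters('end scoped-data') == ('end-scoped-data', {})
    assert split_command_and_parameters('loop gang private(a) &') == (
        'loop', {'gang': None, 'private': 'a'}
    )
    # Malformed or empty content yields no command
    assert split_command_and_parameters('') == (None, None)
    assert split_command_and_parameters('end') == (None, None)
```

The cases cover a command with clauses, a matching end command, a line-continuation marker, and two malformed inputs, which are the branches the reviewed snippet distinguishes.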

@@ -759,6 +758,7 @@ def create_pool_allocator(self, routine, stack_size):
stack_ptr = self._get_stack_ptr(routine)
stack_end = self._get_stack_end(routine)

# TODO: generalise using generic Loki pragmas?

Collaborator

Yes!

"""
Loki generic pragmas to OpenACC mapper.
"""

Collaborator

If you put this here at class-member indentation level, then you shouldn't need to repeat this for every method:

Suggested change
# pylint: disable=unused-argument

Comment on lines +287 to +294
pmapper_map = {'openacc': OpenACCPragmaMapper(), 'omp-gpu': OpenMPOffloadPragmaMapper()}
if self.directive in pmapper_map:
    self.pmapper = pmapper_map[self.directive]
else:
    if self.keep_loki_pragmas:
        self.pmapper = None
    else:
        self.pmapper = GenericPragmaMapper()

Collaborator

If you want to limit instantiation to the mapper that is actually going to be used, you could use a pattern like this:

Suggested change
-pmapper_map = {'openacc': OpenACCPragmaMapper(), 'omp-gpu': OpenMPOffloadPragmaMapper()}
-if self.directive in pmapper_map:
-    self.pmapper = pmapper_map[self.directive]
-else:
-    if self.keep_loki_pragmas:
-        self.pmapper = None
-    else:
-        self.pmapper = GenericPragmaMapper()
+pmapper_cls_map = {
+    'openacc': OpenACCPragmaMapper,
+    'omp-gpu': OpenMPOffloadPragmaMapper,
+}
+pmapper_cls = pmapper_cls_map.get(self.directive, None if self.keep_loki_pragmas else GenericPragmaMapper)
+self.pmapper = pmapper_cls() if pmapper_cls else None
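
The pattern can be tried in isolation with the following sketch. The three mapper classes are empty stand-ins that only record their instantiation, and `select_mapper` and `created` are hypothetical names introduced for illustration; the point is that mapping to classes rather than instances creates only the mapper that is actually selected.

```python
created = []  # records which mapper classes were instantiated


class GenericPragmaMapper:
    """Empty stand-in for the mapper base class."""

    def __init__(self):
        created.append(type(self).__name__)


class OpenACCPragmaMapper(GenericPragmaMapper):
    pass


class OpenMPOffloadPragmaMapper(GenericPragmaMapper):
    pass


# Map directive names to classes, not instances
PMAPPER_CLS_MAP = {
    'openacc': OpenACCPragmaMapper,
    'omp-gpu': OpenMPOffloadPragmaMapper,
}


def select_mapper(directive, keep_loki_pragmas=True):
    """Instantiate only the mapper that matches `directive`, if any."""
    fallback = None if keep_loki_pragmas else GenericPragmaMapper
    pmapper_cls = PMAPPER_CLS_MAP.get(directive, fallback)
    return pmapper_cls() if pmapper_cls else None
```

Calling `select_mapper('omp-gpu')` appends a single entry to `created`, whereas building the dictionary from instances would construct both mappers up front.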

@@ -74,14 +74,14 @@ def annotate_vector_loops(self, routine):
 for pragma in as_tuple(loop.pragma):
     if is_loki_pragma(pragma, starts_with='loop vector reduction'):
         # Turn reduction pragmas into `!$acc` equivalent
-        pragma._update(keyword='acc')
+        pragma._update(keyword='loki')

Collaborator

The control flow of this loop doesn't make too much sense anymore, given that there's no longer a translation into acc directives. I think this could be simplified as follows:

                if private_arrays:
                    for pragma in as_tuple(loop.pragma):
                        if 'reduction' not in (pragma_parameters := get_pragma_parameters(pragma, starts_with='loop vector')):
                            # Add private clause
                            pragma_parameters['private'] = ', '.join(
                                v.name
                                for v in pragma_parameters.get('private', []) + private_arrays
                            )
                            pragma_content = [f'{kw}({val})' if val else kw for kw, val in pragma_parameters.items()]
                            pragma._update(content=f'loop vector {" ".join(pragma_content)}'.strip())
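
The clause handling in this suggestion can be sketched on plain data as follows. `add_private_clause` is a hypothetical helper, and the sketch assumes that the parameters arrive as a dict mapping clause names to their argument strings (or `None` for bare keywords) and that the private arrays are given by name. Both are assumptions made for illustration, not the Loki API.

```python
def add_private_clause(parameters, private_names):
    """Return 'loop vector ...' content with `private_names` merged into private(...).

    Returns None for loops with a reduction clause, which this sketch leaves untouched.
    """
    if 'reduction' in parameters:
        return None
    merged = dict(parameters)
    # Existing private(...) arguments come as a comma-separated string
    existing = [n.strip() for n in (merged.get('private') or '').split(',') if n.strip()]
    # Preserve order and avoid listing a variable twice
    names = list(dict.fromkeys(existing + list(private_names)))
    merged['private'] = ', '.join(names)
    clauses = [f'{kw}({val})' if val else kw for kw, val in merged.items()]
    return f'loop vector {" ".join(clauses)}'.strip()
```

For example, `add_private_clause({'private': 'x'}, ['a'])` yields `loop vector private(x, a)`. Splitting the existing clause before merging avoids concatenating a string with a list when a `private` clause is already present.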

Comment on lines +46 to +49
!$loki unscoped-data in(tmp1, tmp2) create(tmp3, tmp4)
!$loki end unscoped-data out(tmp2, tmp3, tmp4) delete(tmp1)
!$loki scoped-data in(tmp1) out(tmp2) inout(tmp3) create(tmp4)
!$loki end scoped-data in(tmp1) out(tmp2) inout(tmp3) create(tmp4)

Collaborator

I'm not 100% happy with the "scoped/unscoped" data nomenclature, because data still has a scope. Both OpenMP and OpenACC call the distinction between the two "structured/unstructured data directives". Maybe we should adopt that here?


item_filter = (ProcedureItem, ModuleItem)

def __init__(self, directive=None, keep_loki_pragmas=True, process_module_items=False):

Collaborator

In what situation would I not want to process module items?

Comment on lines 271 to +274
 if column_locals:
     vnames = ', '.join(v.name for v in column_locals)
-    pragma = ir.Pragma(keyword='acc', content=f'enter data create({vnames})')
-    pragma_post = ir.Pragma(keyword='acc', content=f'exit data delete({vnames})')
+    pragma = ir.Pragma(keyword='loki', content=f'unscoped-data create({vnames})')
+    pragma_post = ir.Pragma(keyword='loki', content=f'unscoped-enddata delete({vnames})')

Collaborator

This entire section is not covered by tests. What situation is this for, what would be required to capture this?
(Cc @mlange05)


class OpenACCPragmaMapper(GenericPragmaMapper):
    """
    Loki generic pragmas to OpenACC mapper.

Collaborator

Can we please add here the table with the directive translation between loki, openacc and openmp?
And a documentation of the constructor arguments, please.

def pmap_create(self, pragma, parameters, **kwargs): # pylint: disable=unused-argument
    if param_device := parameters.get('device'):
        return Pragma(keyword='acc', content=f'declare create({param_device})')
    return self.default_retval()

Collaborator

The default_retval case is never hit in tests. I'm wondering whether it would be better to raise an exception instead of returning something when a Loki directive doesn't match the "spec".
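
One way the stricter behaviour could look is sketched below. The `Pragma` dataclass is a minimal stand-in for the Loki IR node, `PragmaMappingError` is a hypothetical exception type, and `pmap_create` is written as a free function to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pragma:
    """Minimal stand-in for a pragma node (keyword and content only)."""
    keyword: str
    content: str


class PragmaMappingError(ValueError):
    """Hypothetical error for Loki directives that do not match the expected form."""


def pmap_create(pragma, parameters):
    """Map ``!$loki create device(...)`` to ``!$acc declare create(...)``."""
    if param_device := parameters.get('device'):
        return Pragma(keyword='acc', content=f'declare create({param_device})')
    # Fail loudly instead of silently returning a default value
    raise PragmaMappingError(
        f"'create' requires a device(...) clause, got: !${pragma.keyword} {pragma.content}"
    )
```

Raising here surfaces malformed directives at transformation time. Whether that is preferable to a permissive default depends on how strictly the Loki directive forms are meant to be enforced.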
