-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC for "Passed Directly" Customization Point #1999
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
I have a working proof of concept for this in godbolt, but needs some cleanup before sharing. Let me know if that would be useful to provide here, but likely any needed details can and should be provided in the text. |
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from the name, the approach looks pretty reasonable to me.
Is there a better / more concise name than `is_passed_directly_to_sycl_kernels` we can use which properly conveys the | ||
meaning to the users? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer if the name somewhere includes onedpl
. The function is not generally relevant for SYCL
. Maybe
is_passed_directly_to_onedpl
, oris_passed_directly_to_onedpl_kernels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, its a fair point that since the customization point in user's code wont be in the oneapi::dpl
namespace, there is little connecting it to oneDPL unless it is in the name. I was trying to be more descriptive about the semantic meaning as this is only relevant for the SYCL based dpcpp backend however that is probably less important than including oneDPL.
Thanks, I think the first of your suggestions is probably the best option so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this property only really makes sense for the SYCL backend of oneDPL, there is a sense that is_passed_directly_to_onedpl
doesn't quite tell the full story of what it is for. It won't be used in the TBB or OpenMP backends of oneDPL, which you wouldn't understand from just the name alone.
On the other hand, something like is_passed_directly_to_sycl
is too broad and could be confusing because the SYCL implementation won't use the function directly. Perhaps the second suggestion of is_passed_directly_to_onedpl_kernels
is closer, but one could take issue with its verbosity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps is_passed_directly_to_onedpl_dpcpp
is the right choice (I hate how wordy it is but I'm not sure we have a choice).
A normal user of oneDPL may better recognize "dpcpp" than "kernels", saves a couple characters too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actially I would prefer not to introduce more names with dpcpp
, to avoid the impression that our implementation is based on DPC++ (and not on SYCL specification).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer moving away from the oneDPL implementation detail (of "passing directly") towards the semantical meaning of the trait, something like "this iterator supports implicit data transfer" or (inverting the value) "requires explicit data transfer" or maybe "is ready for use with/suitable for oneDPL device policies".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like requires_explicit_data_transfer_onedpl_device_policies
with inverted values better than talking about implicit data transfer, because passed directly is more about not needing transfer in the first place. Implicit data transfer is appropriate for buffer accessors or shared USM, but not all "passed directly" types. USM device pointers or counting_iterators
are passed directly, but don't have any data transfer, implicit or explicit.
Another option: is_dereferencable_in_onedpl_device_policies
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I (honestly) would say is_passed_directly_to_sycl_backend
. We use 'backend' term in our documentation, so it should be clear enough. Please don't repeat _onedpl_
in the name, because it's already in oneapi::dpl
namespace. Duplication doesn't bring clarity. I am 100% agree with the intent to not use dpcpp
in the name. The rarer we use it, the better. But this is a comment about the name. I have some amount of questions to the approach itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some development from the original idea, the only public API is the trait which will include the oneapi::dpl
namespace. However, users will still be overriding the customization point by defining a function in their own namespace, and I believe that name should be associated with the trait (its name + _v
). Without repeating onedpl in the name of the customization point, there will be nothing tying it to oneDPL in the users code. This is the motivation for including _onedpl_
.
We could consider different names for the customization point and the trait, but that may also be confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not use the word 'backend" in the public API (actually, not at all in the specification), and I think we should not.
There are two elements of the API to name - the trait value (also a class?) and the user-defined customization function. Their names should be related, but I am not sure if almost exact match is needed. I agree that the name of the function should refer to oneDPL; that can be achieved by adding a prefix to the function name.
Also, as far as I understand, the trait is not so much (if at all) about the iterator itself "passed" to a device, as it is about the data "underneath" the iterator being accessible from that device, so that no data copying/no intermediate buffer is required.
Can it be something like onedpl_is_iterator_device_ready()
for the function and oneapi::dpl::is_iterator_device_ready[_v]
for the trait?
Should we be targeting Experimental or fully supported with this proposal? | ||
(Do we think user feedback is required to solidify an interface / experience?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the general design of passed directly has been tested internally pretty well at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. From my perspective the only reason to keep it in experimental to start with is if we are uncertain of the exact API specifics or to find any unexpected gotchas with the approach of using a customization point generally as opposed to some other option.
Lets see what others have to say, but I'm leaning toward targeting supported, and just adding it to the specification directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that there is not much need for an experimental phase. But a POC with practical usage outside of oneDPL is necessary I think.
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
To make this robust, we will follow an C++17 updated version of what is discussed in | ||
[Eric Niebler's Post](https://ericniebler.com/2014/10/21/customization-point-design-in-c11-and-beyond/), using a | ||
callable, and using an `inline constexpr` to avoid issues with ODR and to avoid issues with resolving customization | ||
points when not separating the call to two steps with a `using` statement first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not know why you refer to the Niebler's post. He describes there the C++20 customization point objects, more or less; but that's not what you propose to do. As far as I understand, the proposed user-defined customizations will be ADL discoverable (if not, then I do not understand how to use those) - and then you need the using
statement to get the default implementation in oneapi::dpl
, which makes exactly the two-step customization, does not it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I am mistaken, but my understanding is that Niebler is describing a way using function pointers to allow qualified calls to also pick up the default implementation rather than just unqualified calls with the using
statement.
We don't need his more complex strategy for ODR because of changes in C++17, but I think the function pointer strategy is still a benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I am mistaken, but my understanding is that Niebler is describing a way using function pointers to allow qualified calls to also pick up the default implementation rather than just unqualified calls with the using statement.
See https://godbolt.org/z/TM914E6fv for an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cleaned up and added my proof of concept to the description as well for another example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my understanding is that Niebler is describing a way using function pointers to allow qualified calls to also pick up the default implementation rather than just unqualified calls with the using statement.
I do not see where he uses any function pointers. His std::begin
is a reference to an instance of struct std::__detail::__begin_fn
which has a function call operator, so that std::begin(X)
is a valid code. This operator internally uses an unqualified call to begin
, which "default" implementation is in std::__detail
and specializations found by ADL. As I said, it's more or less matches the CPO design in C++20.
And it's not what you proposed so far, as far as I can see. In this proposal, the default implementation is a free function in the oneapi::dpl
namespace, and customization is a free function in the user's type namespace (it is not said explicitly that the namespace should be the same, but de-facto it should, for ADL to work). So I do not see how a qualified call will take customizations, neither how an unqualified call will take the default implementation without a using declaration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I need to improve the language to be more specific and accurate about the proposed implementation details, but I think what you describe in your first paragraph is correct and accurate to my intentions.
You are correct that an unqualified call does require a using
declaration, but a qualified call should take the user customizations because a qualified call will use the function object. The function object internally makes a unqualified call from the namespace of the default implementation on behalf of the user, allowing it to either find the more specific user customization if it exists, or end up in the default otherwise. You should be able to see this tested in the proof of concept here on line 121.
Hopefully I'm not missing something, but I may be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I was wrong to mention function pointers, I meant function objects)
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
constexpr bool is_passed_directly_in_onedpl_device_policies(const T&); | ||
``` | ||
|
||
oneDPL will provide a default implementation which will defer to the existing trait: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What header should a user include to get the default implementation? Execution? I think it would be good to specify it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its a good question, and I think I agree that execution is the appropriate place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<oneapi/dpl/traits>
or oneapi/dpl/type_traits
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that the RFC has evolved a bit to make the public API a trait rather than directly calling the customization point, I think oneapi/dpl/type_traits
does make more sense than oneapi/dpl/execution
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oneapi/dpl/type_traits
makes sense to me as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should better go to <oneapi/dpl/iterator>
.
type_traits
is OK in principle, but, as far as I can tell, this thing is closer semantically to a concept representing an iterator category (like forward_iterator
) or a requirement (like sortable
) than to a generic type trait.
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
only offering customization point and trait as public API Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Provides a proposal for a public customization point for users to define to indicate if their types are passed directly to sycl kernels.
This RFC is intended to address and resolve #1939.
Here is a working proof of concept to play with: https://godbolt.org/z/jvo1esoeb (updated)