Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: define behavior for __setitem__ when the assigned value promotes to the array's dtype #916

Open
crusaderky opened this issue Mar 25, 2025 · 6 comments · May be fixed by #920
Open

RFC: define behavior for __setitem__ when the assigned value promotes to the array's dtype #916

crusaderky opened this issue Mar 25, 2025 · 6 comments · May be fixed by #920
Labels
RFC Request for comments. Feature requests and proposed changes.
Milestone

Comments

@crusaderky
Copy link
Contributor

crusaderky commented Mar 25, 2025

__setitem__ states:
https://data-apis.org/array-api/2024.12/API_specification/generated/array_api.array.__setitem__.html#array_api.array.__setitem__

Setting array values must not affect the data type of self.

When value is a Python scalar (i.e., int, float, complex, bool), behavior must follow specification guidance on mixing arrays with Python scalars (see Type Promotion Rules).

When value is an array of a different data type than self, how values are cast to the data type of self is implementation defined.

The last line is majorly problematic in my opinion and should be revised.
These use cases are left to the library's discretion:

  • setting a float array value into an integer array, e.g.
>>> a = xp.asarray([1], dtype=xp.int64)
>>> a[0] = xp.asarray(1.0)
  • setting a signed array value into an unsigned array, e.g.
>>> a = xp.asarray([1], dtype=xp.unt64)
>>> a[0] = xp.asarray(-2, dtype=xp.int8)
  • setting an array value with a larger dtype into a smaller dtype of the same kind, e.g.
>>> a = xp.asarray([1], dtype=xp.int8)
>>> a[0] = xp.asarray(2, dtype=xp.int64)

All these use cases are quietly allowed by numpy.
Sanity would dictate for them to be all disallowed, exactly like in binops.

If the last use case is allowed, it also opens the issue of in-place binops (__iadd__) etc.
Whereas out-of-place binops unambiguously must cast the smaller dtype to the larger one and then perform the operation, what is the correct process for in-place ones, when lhs has smaller dtype than rhs?

lhs[idx] = op(lhs[idx], xp.astype(rhs[idx], lhs.dtype))

or

lhs[idx] = xp.astype(op(lhs[idx], rhs[idx]), lhs.dtype)

which is the same as saying

t = xp.result_type(lhs, rhs)
lhs[idx] = xp.astype(op(xp.astype(lhs[idx], t), xp.astype(rhs[idx], t)), lhs.dtype)

? the output will subtly differ.

@crusaderky crusaderky changed the title Clarification on __setitem__ and in-place ops type promotion rules Tighten __setitem__ and in-place ops type promotion rules Mar 25, 2025
@crusaderky
Copy link
Contributor Author

crusaderky commented Mar 25, 2025

In my opinion, the wording should be changed from

  • Setting array values must not affect the data type of self.
  • When value is a Python scalar (i.e., int, float, complex, bool), behavior must follow specification guidance on mixing arrays with Python scalars (see Type Promotion Rules).
  • When value is an array of a different data type than self, how values are cast to the data type of self is implementation defined.

to

  • Setting array values must not affect the data type of self.
  • Behavior must otherwise follow Type Promotion Rules.
  • If it's not possible to respect the rules by just altering the value being set (for example, when setting a int32 value into a int16 array), behavior is undefined.

@rgommers
Copy link
Member

That wording change sounds fine to me.

@mdhaber
Copy link
Contributor

mdhaber commented Mar 25, 2025

Assuming "promote value to [type]" is unambiguous1, is the last bullet different from:

If value does not promote to self.dtype, behavior is undefined.

?

If so, maybe the three rules can be summarized

value must be promoted to self.dtype according to Type Promotion Rules; if this is not possible, behavior is undefined.

Or maybe that is only the second and third rule. I suppose the first rule is independent, because it defines what can't happen.

Footnotes

  1. leave it alone if value.dtype == self.dtype, otherwise follow arrows from value.dtype to self.dtype

@crusaderky
Copy link
Contributor Author

crusaderky commented Mar 26, 2025

Or maybe that is only the second and third rule. I suppose the first rule is independent, because it defines what can't happen.

In this case I would favor a little bit of potential redundancy for the sake of clarity.
I like the suggestion to squash lines 2 and 3:

  • Setting array values must not affect the data type of self.
  • value must be promoted to self.dtype according to Type Promotion Rules; if this is not possible, behavior is undefined.

@crusaderky
Copy link
Contributor Author

Should the new wording only affect the draft, or also be backported to previous versions?
To me it feels that this doesn't change the intended expected behaviour and it is just a clarification.

@kgryte kgryte changed the title Tighten __setitem__ and in-place ops type promotion rules RFC: define behavior for __setitem__ when the assigned value promotes to the array's dtype Apr 3, 2025
@kgryte kgryte added the RFC Request for comments. Feature requests and proposed changes. label Apr 3, 2025
@kgryte kgryte added this to the v2025 milestone Apr 3, 2025
@kgryte
Copy link
Contributor

kgryte commented Apr 3, 2025

To me it feels that this doesn't change the intended expected behaviour and it is just a clarification.

@crusaderky Only the draft for this one, I think.

@crusaderky crusaderky linked a pull request Apr 3, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
RFC Request for comments. Feature requests and proposed changes.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants