From 4863912ab55982e9b7dce7da929f3e44e76b00cf Mon Sep 17 00:00:00 2001
From: Athan Reines
Date: Wed, 26 Feb 2025 02:28:26 -0800
Subject: [PATCH 1/2] docs: add design topic on `copy` keyword argument behavior

Closes: https://github.com/data-apis/array-api/pull/886
Closes: https://github.com/data-apis/array-api/issues/866
---
 .../copies_views_and_mutation.rst | 106 +++++++++++-------
 1 file changed, 64 insertions(+), 42 deletions(-)

diff --git a/spec/draft/design_topics/copies_views_and_mutation.rst b/spec/draft/design_topics/copies_views_and_mutation.rst
index 1ca5a039c..3bbeae165 100644
--- a/spec/draft/design_topics/copies_views_and_mutation.rst
+++ b/spec/draft/design_topics/copies_views_and_mutation.rst
@@ -1,6 +1,6 @@
 .. _copyview-mutability:
 
-Copy-view behaviour and mutability
+Copy-view behavior and mutability
 ==================================
 
 .. admonition:: Mutating views
@@ -10,11 +10,11 @@ Copy-view behaviour and mutability
 
 Strided array implementations (e.g. NumPy, PyTorch, CuPy, MXNet) typically
 have the concept of a "view", meaning an array containing data in memory that
-belongs to another array (i.e. a different "view" on the original data).
-Views are useful for performance reasons - not copying data to a new location
-saves memory and is faster than copying - but can also affect the semantics
+belongs to another array (i.e., a different "view" on the original data).
+Views are useful for performance reasons—not copying data to a new location
+saves memory and is faster than copying—but can also affect the semantics
 of code. This happens when views are combined with *mutating* operations.
-This simple example illustrates that:
+The following example is illustrative:
 
 .. code-block:: python
 
@@ -22,56 +22,78 @@ This simple example illustrates that:
     y = x[:]  # `y` *may* be a view on the data of `x`
     y -= 1  # if `y` is a view, this modifies `x`
 
-Code as simple as the above example will not be portable between array
-libraries - for NumPy/PyTorch/CuPy/MXNet ``x`` will contain the value ``0``,
-while for TensorFlow/JAX/Dask it will contain the value ``1``. The combination
-of views and mutability is fundamentally problematic here if the goal is to
-be able to write code with unambiguous semantics.
+Code similar to the above example will not be portable between array
+libraries. For example, for NumPy, PyTorch, and CuPy, ``x`` will contain the value ``0``,
+while, for TensorFlow, JAX, and Dask, ``x`` will contain the value ``1``. In
+this case, the combination of views and mutability is fundamentally problematic
+if the goal is to be able to write code with unambiguous semantics.
 
 Views are necessary for getting good performance out of the current strided
-array libraries. It is not always clear however when a library will return a
-view, and when it will return a copy. This API standard does not attempt to
-specify this - libraries can do either.
+array libraries. It is not always clear, however, when a library will return a
+view and when it will return a copy. This standard does not attempt to
+specify this—libraries may do either.
 
-There are several types of operations that do in-place mutation of data
-contained in arrays. These include:
+There are several types of operations that may perform in-place mutation of
+array data. These include:
 
-1. Inplace operators (e.g. ``*=``)
+1. In-place operators (e.g. ``*=``)
 2. Item assignment (e.g. ``x[0] = 1``)
 3. Slice assignment (e.g., ``x[:2, :] = 3``)
 4. The `out=` keyword present in some strided array libraries (e.g. ``sin(x, out=y)``)
 
-Libraries like TensorFlow and JAX tend to support inplace operators, provide
+Libraries such as TensorFlow and JAX tend to support in-place operators by providing
 alternative syntax for item and slice assignment (e.g. an ``update_index``
-function or ``x.at[idx].set(y)``), and have no need for ``out=``.
+function or ``x.at[idx].set(y)``) and have no need for ``out=``.
 
-A potential solution could be to make views read-only, or use copy-on-write
-semantics. Both are hard to implement and would present significant issues
-for backwards compatibility for current strided array libraries. Read-only
-views would also not be a full solution, given that mutating the original
-(base) array will also result in ambiguous semantics. Hence this API standard
-does not attempt to go down this route.
+A potential solution could be to make views read-only or implement copy-on-write
+semantics. Both are hard to implement and would present significant backward
+compatibility issues for current strided array libraries. Read-only
+views would also not be a full solution due to the fact that mutating the original
+(base) array will also result in ambiguous semantics. Accordingly, this standard
+does not attempt to pursue this solution.
 
-Both inplace operators and item/slice assignment can be mapped onto
+Both in-place operators and item/slice assignment can be mapped onto
 equivalent functional expressions (e.g. ``x[idx] = val`` maps to
-``x.at[idx].set(val)``), and given that both inplace operators and item/slice
+``x.at[idx].set(val)``), and, given that both in-place operators and item/slice
 assignment are very widely used in both library and end user code, this
 standard chooses to include them.
 
-The situation with ``out=`` is slightly different - it's less heavily used, and
-easier to avoid. It's also not an optimal API, because it mixes an
+The situation with ``out=`` is slightly different—it's less heavily used, and
+easier to avoid. It's also not an optimal API because it mixes an
 "efficiency of implementation" consideration ("you're allowed to do this
-inplace") with the semantics of a function ("the output _must_ be placed into
-this array). There are libraries that do some form of tracing or abstract
-interpretation over a language that does not support mutation (to make
-analysis easier); in those cases implementing ``out=`` with correct handling of
-views may even be impossible to do. There's alternatives, for example the
-donated arguments in JAX or working buffers in LAPACK, that allow the user to
-express "you _may_ overwrite this data, do whatever is fastest". Given that
-those alternatives aren't widely used in array libraries today, this API
-standard chooses to (a) leave out ``out=``, and (b) not specify another method
-of reusing arrays that are no longer needed as buffers.
-
-This leaves the problem of the initial example - with this API standard it
-remains possible to write code that will not work the same for all array
-libraries. This is something that the user must be careful about.
+in-place") with the semantics of a function ("the output _must_ be placed into
+this array"). There are libraries that do some form of tracing or abstract
+interpretation over a vocabulary that does not support mutation (to make
+analysis easier). In those cases implementing ``out=`` with correct handling of
+views may even be impossible to do.
+
+There are alternatives. For example, the concept of donated arguments in JAX or
+working buffers in LAPACK which allow the user to express "you _may_ overwrite
+this data; do whatever is fastest". Given that those alternatives aren't widely
+used in array libraries today, this standard chooses to (a) leave out ``out=``,
+and (b) not specify another method of reusing arrays that are no longer needed
+as buffers.
+
+This leaves the problem of the initial example—despite the best efforts of this
+standard, it remains possible to write code that will not work the same for all
+array libraries. This is something that the users are advised to best keep in
+mind and to reason carefully about the potential ambiguity of implemented code.
+
+Copy keyword argument behavior
+------------------------------
+
+Several APIs in this standard support a ``copy`` keyword argument (e.g.,
+``asarray``, ``astype``, ``reshape``, and ``__dlpack__``). Typically, when a
+user sets ``copy=True``, the user does so in order to ensure that they are free
+to mutate the returned array without side-effects—namely, without mutating other
+views on the original (base) array. Accordingly, when ``copy=True``, unless an
+array library can guarantee that an array can be mutated without side-effects,
+conforming libraries are recommended to always perform a physical copy of the
+underlying array data.
+
+.. note::
+   Typically, in order to provide such a guarantee, libraries must perform
+   whole-program analysis.
+
+Conversely, consumers of this standard should expect that, if they set
+``copy=True``, they are free to use in-place operations on a returned array.
From f6dc711114c333b726775b0fdbf58b7843b49be9 Mon Sep 17 00:00:00 2001
From: Athan Reines
Date: Wed, 26 Feb 2025 02:38:20 -0800
Subject: [PATCH 2/2] chore: link to design topic

---
 spec/draft/design_topics/copies_views_and_mutation.rst | 3 +++
 src/array_api_stubs/_draft/array_object.py | 2 +-
 src/array_api_stubs/_draft/creation_functions.py | 2 +-
 src/array_api_stubs/_draft/data_type_functions.py | 2 +-
 src/array_api_stubs/_draft/manipulation_functions.py | 2 +-
 5 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/spec/draft/design_topics/copies_views_and_mutation.rst b/spec/draft/design_topics/copies_views_and_mutation.rst
index 3bbeae165..f302d8c8e 100644
--- a/spec/draft/design_topics/copies_views_and_mutation.rst
+++ b/spec/draft/design_topics/copies_views_and_mutation.rst
@@ -79,6 +79,9 @@ standard, it remains possible to write code that will not work the same for all
 array libraries. This is something that the users are advised to best keep in
 mind and to reason carefully about the potential ambiguity of implemented code.
 
+
+.. _copy-keyword-argument:
+
 Copy keyword argument behavior
 ------------------------------
 
diff --git a/src/array_api_stubs/_draft/array_object.py b/src/array_api_stubs/_draft/array_object.py
index ba5f99851..ca1422bf9 100644
--- a/src/array_api_stubs/_draft/array_object.py
+++ b/src/array_api_stubs/_draft/array_object.py
@@ -370,7 +370,7 @@ def __dlpack__(
             API standard.
         copy: Optional[bool]
             boolean indicating whether or not to copy the input. If ``True``, the
-            function must always copy (performed by the producer). If ``False``, the
+            function must always copy (performed by the producer; see also :ref:`copy-keyword-argument`). If ``False``, the
             function must never copy, and raise a ``BufferError`` in case a copy is
             deemed necessary (e.g. if a cross-device data movement is requested, and
             it is not possible without a copy). If ``None``, the function must reuse
diff --git a/src/array_api_stubs/_draft/creation_functions.py b/src/array_api_stubs/_draft/creation_functions.py
index 6de79268e..c09800783 100644
--- a/src/array_api_stubs/_draft/creation_functions.py
+++ b/src/array_api_stubs/_draft/creation_functions.py
@@ -111,7 +111,7 @@ def asarray(
     device: Optional[device]
         device on which to place the created array. If ``device`` is ``None`` and ``obj`` is an array, the output array device must be inferred from ``obj``. Default: ``None``.
     copy: Optional[bool]
-        boolean indicating whether or not to copy the input. If ``True``, the function must always copy. If ``False``, the function must never copy for input which supports the buffer protocol and must raise a ``ValueError`` in case a copy would be necessary. If ``None``, the function must reuse existing memory buffer if possible and copy otherwise. Default: ``None``.
+        boolean indicating whether or not to copy the input. If ``True``, the function must always copy (see :ref:`copy-keyword-argument`). If ``False``, the function must never copy for input which supports the buffer protocol and must raise a ``ValueError`` in case a copy would be necessary. If ``None``, the function must reuse existing memory buffer if possible and copy otherwise. Default: ``None``.
 
     Returns
     -------
diff --git a/src/array_api_stubs/_draft/data_type_functions.py b/src/array_api_stubs/_draft/data_type_functions.py
index db793c16e..d7ae3caee 100644
--- a/src/array_api_stubs/_draft/data_type_functions.py
+++ b/src/array_api_stubs/_draft/data_type_functions.py
@@ -43,7 +43,7 @@ def astype(
     dtype: dtype
         desired data type.
     copy: bool
-        specifies whether to copy an array when the specified ``dtype`` matches the data type of the input array ``x``. If ``True``, a newly allocated array must always be returned. If ``False`` and the specified ``dtype`` matches the data type of the input array, the input array must be returned; otherwise, a newly allocated array must be returned. Default: ``True``.
+        specifies whether to copy an array when the specified ``dtype`` matches the data type of the input array ``x``. If ``True``, a newly allocated array must always be returned (see :ref:`copy-keyword-argument`). If ``False`` and the specified ``dtype`` matches the data type of the input array, the input array must be returned; otherwise, a newly allocated array must be returned. Default: ``True``.
     device: Optional[device]
         device on which to place the returned array. If ``device`` is ``None``, the output array device must be inferred from ``x``. Default: ``None``.
 
diff --git a/src/array_api_stubs/_draft/manipulation_functions.py b/src/array_api_stubs/_draft/manipulation_functions.py
index 7e94cbc27..1fc178e20 100644
--- a/src/array_api_stubs/_draft/manipulation_functions.py
+++ b/src/array_api_stubs/_draft/manipulation_functions.py
@@ -230,7 +230,7 @@ def reshape(
     shape: Tuple[int, ...]
         a new shape compatible with the original shape. One shape dimension is allowed to be ``-1``. When a shape dimension is ``-1``, the corresponding output array shape dimension must be inferred from the length of the array and the remaining dimensions.
     copy: Optional[bool]
-        whether or not to copy the input array. If ``True``, the function must always copy (see :ref:`copy-keyword-argument`). If ``False``, the function must never copy. If ``None``, the function must avoid copying, if possible, and may copy otherwise. Default: ``None``.
+        whether or not to copy the input array. If ``True``, the function must always copy (see :ref:`copy-keyword-argument`). If ``False``, the function must never copy. If ``None``, the function must avoid copying, if possible, and may copy otherwise. Default: ``None``.
 
     Returns
     -------
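
Reviewer note: the three-state ``copy`` keyword touched by this series (``True``: always copy, ``False``: never copy and raise if a copy would be needed, ``None``: reuse memory when possible) can be sketched with the standard library alone. This is only an illustration under stated assumptions: ``as_buffer`` is a hypothetical helper, not part of the array API standard or of any array library, and ``bytearray``/``memoryview`` stand in for an array and a view on it.

```python
def as_buffer(obj, copy=None):
    """Hypothetical stand-in for an API with a three-state ``copy`` keyword."""
    try:
        # Succeeds only if `obj` exports the buffer protocol (e.g. bytearray).
        view = memoryview(obj)
    except TypeError:
        view = None

    if copy is True:
        # Always allocate fresh memory, so the caller may mutate freely.
        return memoryview(bytearray(obj))
    if view is not None:
        # copy=False or copy=None: reuse the existing memory.
        return view
    if copy is False:
        raise ValueError("a copy would be necessary, but copy=False was requested")
    # copy=None and no reusable buffer: fall back to copying.
    return memoryview(bytearray(obj))


base = bytearray([1, 2, 3])

fresh = as_buffer(base, copy=True)
fresh[0] = 9   # in-place write; `base` is not affected

shared = as_buffer(base, copy=False)
shared[1] = 7  # in-place write; visible through `base` because memory is shared
```

The ``copy=True`` branch mirrors the expectation stated in the design topic: the caller can use in-place operations on the result without side effects on the original data, whereas the ``copy=False`` result behaves like a view.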