Array API Comparison

Data and tooling to compare the API surfaces of various array libraries.

Overview

The goal of this repository is to compare the public API surfaces of various PyData array libraries in order to better understand existing practice. In analyzing both the commonalities and differences across array libraries, we can derive a common API subset which can be standardized and used to ensure consistency (naming and otherwise) across array libraries. This API subset should include attribute names, method names, and positional and keyword arguments.

By deriving a common API subset, we can reduce friction among library consumers by reducing the cognitive overhead of learning array dialects. This is exemplified by the following user story:

As an array library author, I know that, regardless of the input array (whether NumPy, Dask, PyTorch, etc.), the array has a method to compute the transpose which is guaranteed to have options x, y, and z.

Currently, the needs of the library author in the above user story are not met, as libraries vary in their naming conventions and the optional arguments they support.

Through specification and array library compliance, we facilitate array interoperability for both users and library developers.

Array Libraries

Currently, the following array libraries are evaluated:

NumPy: serves as the reference API against which all other array libraries are compared.
CuPy
Dask.array
JAX
MXNet
PyTorch
rnumpy: an opinionated curation of NumPy APIs, serving as an exercise in evaluating what is most "essential" (i.e., the smallest set of building block functionality on which most array functionality can be built).
PyData/Sparse
TensorFlow

Installation

Navigate to the directory into which you want to clone this repository:

$ cd ./repository/destination/directory

Next, clone the repository:

$ git clone https://github.com/pydata-apis/array-api-comparison.git

Once cloned, navigate to the repository directory:

$ cd ./array-api-comparison

Create an Anaconda environment:

$ conda create -n array-api-comparison -c conda-forge python=3.8 nodejs

Activate the environment:

$ conda activate array-api-comparison

Run the installation sequence:

$ make

Usage

To view all array API tables in your local web browser,

$ make view-docs

To view cross-library array API data,

$ make view-join

To view the intersection of array library APIs,

$ make view-intersection

To view a table ranking the intersection of array library APIs,

$ make view-intersection-ranks

To view array library APIs which are not in the intersection,

$ make view-complement

Organization

This repository contains the following directories:

data: array API data (e.g., array library APIs and their NumPy equivalents).
docs: browser-based documentation for viewing array API data.
etc: configuration files.
scripts: scripts for data manipulation and documentation generation.

The data directory contains the following directories:

raw: raw array library API data.
joins: array library APIs matched to their NumPy equivalents.

The raw data directory contains the following datasets:

XXXXX.(csv|json): raw array library API data.

The joins data directory contains the following datasets:

XXXXX_numpy.(csv|json): array library APIs and their NumPy equivalents.

Lastly, the root data directory contains the following additional datasets:

join.(csv|json): array library API data combined in a single file.
intersection.(csv|json): array library API intersection.
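
To make the idea of an API intersection concrete, the following is a minimal sketch, not the repository's actual tooling. It assumes each library's join data is a list of records with `name` and `numpy` fields (the format described under Contributing) and treats the intersection as the set of NumPy APIs that have an equivalent in every library. The function names and the sample records are hypothetical.

```python
from typing import Dict, List, Set

Record = Dict[str, str]


def numpy_equivalents(records: List[Record]) -> Set[str]:
    """Collect the NumPy equivalents named in one library's join records."""
    return {r["numpy"] for r in records if r.get("numpy")}


def api_intersection(joins: Dict[str, List[Record]]) -> List[str]:
    """Return the NumPy APIs that have an equivalent in every library."""
    sets = [numpy_equivalents(records) for records in joins.values()]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


def api_complement(joins: Dict[str, List[Record]]) -> List[str]:
    """Return the NumPy APIs matched by some, but not all, libraries."""
    sets = [numpy_equivalents(records) for records in joins.values()]
    if not sets:
        return []
    return sorted(set.union(*sets) - set.intersection(*sets))


# Hypothetical, hand-written records (not taken from the repository data).
joins = {
    "cupy": [
        {"name": "all", "numpy": "numpy.all"},
        {"name": "allclose", "numpy": "numpy.allclose"},
    ],
    "dask.array": [
        {"name": "all", "numpy": "numpy.all"},
    ],
}

print(api_intersection(joins))  # ['numpy.all']
print(api_complement(joins))    # ['numpy.allclose']
```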

When editing data files, consider the JSON data to be the source of truth. CSV files are generated from the JSON data.
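
As an illustration of how a CSV file could be derived from the JSON data, here is a small sketch using only the Python standard library. It is not the repository's own conversion script; the function name is hypothetical, and the column order (first appearance of each field) is an assumption, since the exact CSV layout is not described here.

```python
import csv
import io
import json


def json_records_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)

    # Assumed column order: fields in order of first appearance.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # missing fields are written as empty cells
    return buffer.getvalue()


example = (
    '[{"name": "all", "numpy": "numpy.all"},'
    ' {"name": "allclose", "numpy": "numpy.allclose"}]'
)
print(json_records_to_csv(example))
```

Because the CSV text is regenerated from the JSON each time, edits only need to be made in one place.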

Contributing

To contribute array API data to this repository, add a data/joins/XXXXX_numpy.json file, where XXXXX is the lowercase name of the relevant array library (e.g., cupy). The JSON file should include a JSON array, where each array element has the following fields:

name: array library API name.
numpy: NumPy API equivalent.

For example,

[
    {
        "name": "all",
        "numpy": "numpy.all"
    },
    {
        "name": "allclose",
        "numpy": "numpy.allclose"
    },
    ...
]
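
Before submitting a new file, a contributor might check that it follows the structure above. The following is a hedged sketch of such a check; the function name and messages are hypothetical and not part of this repository. It only verifies that the top-level value is an array whose elements are objects containing the `name` and `numpy` fields, and makes no assumption about the values themselves.

```python
import json
from typing import List

REQUIRED_FIELDS = ("name", "numpy")


def validate_join(json_text: str) -> List[str]:
    """Return a list of problems found; an empty list means the check passed."""
    data = json.loads(json_text)
    if not isinstance(data, list):
        return ["top-level value must be a JSON array"]

    problems = []
    for index, element in enumerate(data):
        if not isinstance(element, dict):
            problems.append(f"element {index}: expected an object")
            continue
        for field in REQUIRED_FIELDS:
            if field not in element:
                problems.append(f"element {index}: missing field '{field}'")
    return problems


good = '[{"name": "all", "numpy": "numpy.all"}]'
bad = '[{"name": "all"}]'

print(validate_join(good))  # []
print(validate_join(bad))   # ["element 0: missing field 'numpy'"]
```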

Once added, the CSV variant can be generated using internal tooling.
