Andfoy support free threading #505

Merged
merged 32 commits on Mar 18, 2025
33ee71b
Test numexpr against pytest-run-parallel on 3.13t
andfoy Mar 3, 2025
8680084
Mock pytest in case it is not available
andfoy Feb 26, 2025
8af34da
Build free-threaded wheels
andfoy Feb 26, 2025
706cb9d
Use CIBW_ENABLE
andfoy Feb 28, 2025
3181455
Use pytest for testing
andfoy Feb 28, 2025
61076a2
Update env variable value
andfoy Mar 3, 2025
1d15ad4
Move free-threaded builds to an independent job
andfoy Mar 4, 2025
40f04d2
Set free-threading variables only under free-threaded conditions
andfoy Mar 5, 2025
0fb95ec
Execute pytest with --pyargs
andfoy Mar 5, 2025
e75d15f
Add section in README regarding free-threading
andfoy Mar 5, 2025
5fe38b2
Fix the name of the wheels for uploading
FrancescAlted Mar 6, 2025
95cbaaa
Remove asterisks from wheel names and other improvements
FrancescAlted Mar 6, 2025
a15f943
Do not remove muslinux builds for now
FrancescAlted Mar 6, 2025
2f5bf50
Use cibw_id to remove * from wheel names
FrancescAlted Mar 6, 2025
68642a1
Be explicit on the build names
FrancescAlted Mar 6, 2025
de54ba2
include -> python
FrancescAlted Mar 6, 2025
16ab7d5
Yet another attempt for wheels
FrancescAlted Mar 6, 2025
18e9b89
Fixing arm64 arch
FrancescAlted Mar 6, 2025
22ac3f1
Fixing arm64 arch
FrancescAlted Mar 6, 2025
3ea863b
Don't use native arm64 builders for now
FrancescAlted Mar 6, 2025
bdbfd94
Skip tests on linux aarch64, not macosx arm64
FrancescAlted Mar 6, 2025
48e7b9a
Add a pre-commit config file
FrancescAlted Mar 6, 2025
04dfbeb
Remove types-all
FrancescAlted Mar 6, 2025
9dad87b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 6, 2025
ccade0b
Remove flake8 for now
FrancescAlted Mar 6, 2025
52b3799
Remove mypy checks for now
FrancescAlted Mar 6, 2025
bb2cffb
Mark numexpr interpreter as free-threaded safe
andfoy Mar 7, 2025
da8a9df
Ensure single thread write to gs.init_sentinels_done
andfoy Mar 12, 2025
9aab353
Address review comments
andfoy Mar 17, 2025
f4439c8
Register the thread_unsafe mark
FrancescAlted Mar 18, 2025
42baf82
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 18, 2025
0c42b7d
Revert commit 9aab353, as it makes some tests crash
FrancescAlted Mar 18, 2025
69 changes: 47 additions & 22 deletions .github/workflows/build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -6,37 +6,56 @@ permissions:
contents: read

env:
CIBW_BEFORE_BUILD: pip install setuptools oldest-supported-numpy
CIBW_BEFORE_BUILD: pip install setuptools oldest-supported-numpy pytest
CIBW_BEFORE_TEST: pip install pytest
CIBW_BUILD_VERBOSITY: 1
CIBW_TEST_COMMAND: python -c "import sys, numexpr; sys.exit(0 if numexpr.test().wasSuccessful() else 1)"
CIBW_TEST_SKIP: "*macosx*arm64*"
CIBW_TEST_COMMAND: pytest --pyargs numexpr
# Testing on aarch64 takes too long, as it is currently emulated on GitHub Actions
CIBW_TEST_SKIP: "*linux*aarch64*"
# Building for musllinux and aarch64 takes way too much time.
# Moreover, NumPy is not providing musllinux for x86_64 either, so it's not worth it.
CIBW_SKIP: "*musllinux*aarch64* *musllinux*x86_64*"

jobs:
build_wheels:
name: Build wheels on ${{ matrix.os }} for ${{ matrix.arch }} - ${{ matrix.p_ver }}
runs-on: ${{ matrix.os }}
name: Build wheels on ${{ matrix.os }} for ${{ matrix.arch }}
runs-on: ${{ matrix.runs-on || matrix.os }}
permissions:
contents: write
env:
CIBW_BUILD: ${{ matrix.cibw_build }}
CIBW_BUILD: ${{ matrix.cibw_pattern }}
CIBW_ARCHS_LINUX: ${{ matrix.arch }}
CIBW_ARCHS_MACOS: "x86_64 arm64"
CIBW_ENABLE: cpython-freethreading
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-latest, macos-latest]
arch: [x86_64, aarch64]
cibw_build: ["cp3{10,11,12,13}-*"]
p_ver: ["3.10-3.13"]
exclude:
- os: windows-latest
arch: aarch64
# cibuild is already in charge to build aarch64 (see CIBW_ARCHS_MACOS)
- os: macos-latest
include:
# Linux x86_64 builds
- os: ubuntu-latest
arch: x86_64
cibw_pattern: "cp3{10,11,12,13,13t}-manylinux*"
artifact_name: "linux-x86_64"

# Linux ARM64 builds (native runners)
- os: ubuntu-latest
arch: aarch64
cibw_pattern: "cp3{10,11,12,13,13t}-manylinux*"
artifact_name: "linux-aarch64"
# Don't use native runners for now (looks like wait times are too long)
#runs-on: ["ubuntu-latest", "arm64"]

# Windows builds
- os: windows-latest
arch: x86_64
cibw_pattern: "cp3{10,11,12,13,13t}-win*"
artifact_name: "windows-x86_64"

# macOS builds (universal2)
- os: macos-latest
arch: x86_64
cibw_pattern: "cp3{10,11,12,13,13t}-macosx*"
artifact_name: "macos-universal2"
steps:
- uses: actions/checkout@v3

Expand All @@ -45,17 +64,22 @@ jobs:
with:
python-version: '3.x'

- name: Install cibuildwheel
- name: Setup free-threading variables
if: ${{ endsWith(matrix.cibw_build, 't-*') }}
shell: bash -l {0}
run: |
python -m pip install cibuildwheel
echo "CIBW_BEFORE_BUILD=pip install setuptools numpy" >> "$GITHUB_ENV"
echo "CIBW_BEFORE_TEST=pip install pytest pytest-run-parallel" >> "$GITHUB_ENV"
echo "CIBW_TEST_COMMAND=pytest --parallel-threads=4 --pyargs numexpr" >> "$GITHUB_ENV"

- uses: docker/setup-qemu-action@v2
if: ${{ matrix.arch == 'aarch64' }}
name: Set up QEMU
- name: Set up QEMU
if: matrix.arch == 'aarch64'
uses: docker/setup-qemu-action@v3
with:
platforms: arm64

- name: Build wheels
run: |
python -m cibuildwheel --output-dir wheelhouse
uses: pypa/[email protected]

- name: Make sdist
if: ${{ matrix.os == 'windows-latest' }}
Expand All @@ -65,6 +89,7 @@ jobs:

- uses: actions/upload-artifact@v4
with:
name: ${{ matrix.artifact_name }}
path: ./wheelhouse/*

- name: Upload to GitHub Release
Expand Down
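The `cibw_pattern` values in the matrix above combine a brace group with a shell-style glob. As a rough illustration of which build identifiers such a pattern would select, here is a minimal Python sketch that uses only the standard library. It is not cibuildwheel's actual implementation; the helper names `expand_braces` and `selects` are hypothetical, and the sample identifiers simply follow cibuildwheel's usual `<python tag>-<platform tag>` form.

```python
import fnmatch
import re


def expand_braces(pattern):
    """Expand {a,b,c} groups into a list of plain glob patterns.

    Simplified on purpose: nested brace groups are not supported.
    """
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that several brace groups in one pattern are all expanded.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def selects(pattern, build_id):
    """Return True if any expanded alternative glob-matches the build identifier."""
    return any(
        fnmatch.fnmatchcase(build_id, alternative)
        for alternative in expand_braces(pattern)
    )


pattern = "cp3{10,11,12,13,13t}-manylinux*"
for build_id in (
    "cp313t-manylinux_x86_64",   # free-threaded 3.13 build
    "cp313-manylinux_aarch64",   # regular 3.13 build
    "cp313t-musllinux_x86_64",   # musllinux is outside the pattern
    "cp39-manylinux_x86_64",     # 3.9 is outside the brace group
):
    print(build_id, selects(pattern, build_id))
```

Under this reading, the `13t` entry in the brace group is what pulls the free-threaded 3.13 wheels into the same job as the regular ones.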
1 change: 1 addition & 0 deletions .gitignore
Expand Up @@ -7,6 +7,7 @@ artifact/
numexpr.egg-info/
*.pyc
*.swp
*.so
*~
doc/_build
site.cfg
Expand Down
26 changes: 26 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,26 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: debug-statements

# Too many things to fix, let's just ignore it for now
#- repo: https://github.com/pycqa/flake8
# rev: 7.0.0
# hooks:
# - id: flake8
#
- repo: https://github.com/pycqa/isort
rev: 5.13.2
hooks:
- id: isort

# Too many things to fix, let's just ignore it for now
#- repo: https://github.com/pre-commit/mirrors-mypy
# rev: v1.8.0
# hooks:
# - id: mypy
# exclude: ^(docs/|setup.py)
2 changes: 1 addition & 1 deletion .readthedocs.yaml
Expand Up @@ -19,4 +19,4 @@ sphinx:
# https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
install:
- requirements: doc/requirements.txt
- requirements: doc/requirements.txt
2 changes: 1 addition & 1 deletion AUTHORS.txt
Expand Up @@ -23,7 +23,7 @@ Google Inc. contributed bug fixes.

David Cox improved readability of the Readme.

Robert A. McLeod contributed bug fixes and ported the documentation to
Robert A. McLeod contributed bug fixes and ported the documentation to
numexpr.readthedocs.io. He has served as the maintainer of the package
since 2016 to 2023.

Expand Down
18 changes: 18 additions & 0 deletions README.rst
Expand Up @@ -159,6 +159,24 @@ Usage
array([ True, False, False], dtype=bool)


Free-threading support
----------------------
Starting with CPython 3.13, there is a new free-threaded build that disables the
Global Interpreter Lock (GIL) altogether, so that multi-threaded code can run in
parallel within a single interpreter, as opposed to having to use
multiprocessing.

Whilst numexpr has been demonstrated to work under free-threaded
CPython, care is needed when combining numexpr's native parallel
implementation with Python threads, in order to prevent oversubscription.
We recommend either using the main CPython interpreter thread to spawn multiple C threads
using the parallel numexpr API, or spawning multiple CPython threads that do not use
the parallel API.

For more information about free-threaded CPython, we recommend visiting the following
`community Wiki <https://py-free-threading.github.io/>`_.
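
The two layouts recommended in this README section can be summarized as a thread budget. The following is a minimal sketch, assuming the goal is simply to keep the product of Python threads and numexpr worker threads within the number of available cores. The helper `plan_threads` and its mode names are hypothetical and not part of numexpr.

```python
import os


def plan_threads(mode, cores=None):
    """Return (python_threads, numexpr_threads) for one of the two layouts.

    "native": a single Python thread drives numexpr's own C thread pool.
    "python": several Python threads each run numexpr single-threaded.
    """
    cores = cores or os.cpu_count() or 1
    if mode == "native":
        return 1, cores
    if mode == "python":
        return cores, 1
    raise ValueError(f"unknown mode: {mode!r}")


# Either way, python_threads * numexpr_threads never exceeds the core count,
# which is the oversubscription the README text warns about.
python_threads, numexpr_threads = plan_threads("python", cores=8)
print(python_threads, numexpr_threads)
```

In practice the second value would be passed to `numexpr.set_num_threads`, and the first would size whatever pool of Python threads the application uses, for example a `concurrent.futures.ThreadPoolExecutor`.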


Documentation
-------------

Expand Down
2 changes: 2 additions & 0 deletions bench/boolean_timing.py
Expand Up @@ -9,8 +9,10 @@
####################################################################

from __future__ import print_function

import sys
import timeit

import numpy

array_size = 5_000_000
Expand Down
9 changes: 6 additions & 3 deletions bench/issue-36.py
Expand Up @@ -2,10 +2,14 @@
# performs better than the serial code. See issue #36 for details.

from __future__ import print_function

from time import time

import numpy as np
import numexpr as ne
from numpy.testing import assert_array_equal
from time import time

import numexpr as ne


def bench(N):
print("*** array length:", N)
Expand All @@ -31,4 +35,3 @@ def bench(N):
ne.set_num_threads(2)
for N in range(10, 20):
bench(2**N)

1 change: 1 addition & 0 deletions bench/issue-47.py
@@ -1,4 +1,5 @@
import numpy

import numexpr

numexpr.set_num_threads(8)
Expand Down
6 changes: 4 additions & 2 deletions bench/large_array_vs_numpy.py
Expand Up @@ -31,10 +31,12 @@
import os

os.environ["NUMEXPR_NUM_THREADS"] = "16"
import threading
import timeit

import numpy as np

import numexpr as ne
import timeit
import threading

array_size = 10**8
num_runs = 10
Expand Down
7 changes: 4 additions & 3 deletions bench/multidim.py
Expand Up @@ -12,9 +12,12 @@
# Based on a script provided by Andrew Collette.

from __future__ import print_function

import time

import numpy as np

import numexpr as nx
import time

test_shapes = [
(100*100*100),
Expand Down Expand Up @@ -90,5 +93,3 @@ def test_func(a, b, c):
print("Simple: ", (stop1-start1)/nruns)
print("Numexpr: ", (stop2-start2)/nruns)
print("Chunked: ", (stop3-start3)/nruns)


4 changes: 3 additions & 1 deletion bench/poly.py
Expand Up @@ -17,11 +17,13 @@
#######################################################################

from __future__ import print_function

import sys
from time import time

import numpy as np
import numexpr as ne

import numexpr as ne

#expr = ".25*x**3 + .75*x**2 - 1.5*x - 2" # the polynomial to compute
expr = "((.25*x + .75)*x - 1.5)*x - 2" # a computer-friendly polynomial
Expand Down
5 changes: 4 additions & 1 deletion bench/timing.py
Expand Up @@ -9,7 +9,10 @@
####################################################################

from __future__ import print_function
import timeit, numpy

import timeit

import numpy

array_size = 5e6
iterations = 2
Expand Down
3 changes: 3 additions & 0 deletions bench/unaligned-simple.py
Expand Up @@ -13,8 +13,11 @@
"""

from __future__ import print_function

from timeit import Timer

import numpy as np

import numexpr as ne

niter = 10
Expand Down
3 changes: 3 additions & 0 deletions bench/varying-expr.py
Expand Up @@ -13,9 +13,12 @@
# the latency of numexpr when working with small arrays.

from __future__ import print_function

import sys
from time import time

import numpy as np

import numexpr as ne

N = 100
Expand Down
3 changes: 3 additions & 0 deletions bench/vml_timing.py
Expand Up @@ -9,9 +9,12 @@
####################################################################

from __future__ import print_function

import sys
import timeit

import numpy

import numexpr

array_size = 5_000_000
Expand Down
5 changes: 4 additions & 1 deletion bench/vml_timing2.py
Expand Up @@ -4,11 +4,14 @@
# https://github.com/pydata/numexpr/wiki/NumexprMKL

from __future__ import print_function

import datetime
import sys
from time import time

import numpy as np

import numexpr as ne
from time import time

N = int(2**26)

Expand Down
4 changes: 3 additions & 1 deletion bench/vml_timing3.py
@@ -1,7 +1,9 @@
# -*- coding: utf-8 -*-
from timeit import default_timer as timer

import numpy as np

import numexpr as ne
from timeit import default_timer as timer

x = np.ones(100000)
scaler = -1J
Expand Down
12 changes: 6 additions & 6 deletions doc/api.rst
Expand Up @@ -3,11 +3,11 @@ NumExpr API

.. automodule:: numexpr
:members: evaluate, re_evaluate, disassemble, NumExpr, get_vml_version, set_vml_accuracy_mode, set_vml_num_threads, set_num_threads, detect_number_of_cores, detect_number_of_threads

.. py:attribute:: ncores

The number of (virtual) cores detected.

.. py:attribute:: nthreads

The number of threads currently in-use.
Expand All @@ -18,11 +18,11 @@ NumExpr API

.. py:attribute:: version

The version of NumExpr.
The version of NumExpr.


Tests submodule
---------------

.. automodule:: numexpr.tests
:members: test, print_versions
:members: test, print_versions
1 change: 0 additions & 1 deletion doc/index.rst
Expand Up @@ -25,4 +25,3 @@ Indices and tables
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
