Add support for Python 3.10 and 3.11 #1937

Merged
merged 72 commits into from
Mar 19, 2024
Changes from 17 commits
ffac856
Add support for Python 3.10 and 3.11
simonzhaoms Jun 6, 2023
ffd8b9e
Correct upper bound version for category_encoders
simonzhaoms Jun 7, 2023
644aa5d
Add tests for Python 3.10 and 3.11
simonzhaoms Jun 7, 2023
9003450
Remove dependencies that others require
simonzhaoms Jun 7, 2023
a157403
Merge in staging
simonzhaoms Jun 9, 2023
793ec87
Update nvidia-ml-py and tensorflow version
simonzhaoms Jun 9, 2023
1556eb4
Install system level dependencies for scipy
simonzhaoms Jun 9, 2023
34aee1d
Merge in staging
simonzhaoms Jun 9, 2023
890a3fe
Support from Python 3.8 to 3.11
simonzhaoms Jun 9, 2023
c7bb846
Remove unused system deps
simonzhaoms Jun 9, 2023
7f2e298
Drop python 3.11 because some packages do not support 3.11
simonzhaoms Jun 9, 2023
09069f7
Install dependencies for scipy in docker image
simonzhaoms Jun 10, 2023
643fed6
Change docker image
simonzhaoms Jun 10, 2023
db4c9c3
Add pip==20.1.1
simonzhaoms Jun 10, 2023
9e05e4f
Correct conda package format
simonzhaoms Jun 10, 2023
e1d6acf
Remove pip downgrade code
simonzhaoms Jun 10, 2023
b71c4ed
Use docker images for ubuntu 22.04
simonzhaoms Jun 11, 2023
3b4a641
Merge in main
SimonYansenZhao Sep 1, 2023
3cb052a
Merge in staging
SimonYansenZhao Sep 2, 2023
d80002e
Replace pandas.util.testing with pandas.testing
SimonYansenZhao Sep 2, 2023
e084412
Remove nonexistent argument check_less_precise of assert_frame_equal()
SimonYansenZhao Sep 2, 2023
a605404
Remove tests for sarplus for Python 3.7
SimonYansenZhao Sep 4, 2023
40361f4
Fixed error: 'DataFrame' object has no attribute 'append'
SimonYansenZhao Sep 4, 2023
9364c9b
Add hypothesis<6.83.1
SimonYansenZhao Sep 4, 2023
be86a29
Use ubuntu-22.04 instead of latest
SimonYansenZhao Sep 4, 2023
0641d95
Update comments
SimonYansenZhao Sep 4, 2023
334ea1a
Merge remote-tracking branch 'origin/simonz-dep-upgrade-20230606' int…
SimonYansenZhao Sep 4, 2023
60e847a
Add python 3.11
SimonYansenZhao Sep 4, 2023
313de47
Add python 3.11
SimonYansenZhao Sep 4, 2023
91be6ae
Add python 3.11
SimonYansenZhao Sep 4, 2023
015ce4a
Add python 3.11
SimonYansenZhao Sep 4, 2023
76901c6
Add python 3.11
SimonYansenZhao Sep 4, 2023
22ac9e2
Add python 3.11
SimonYansenZhao Sep 4, 2023
d0074c5
Merge remote-tracking branch 'origin/simonz-dep-upgrade-20230606' int…
SimonYansenZhao Sep 4, 2023
c3a7030
Remove python 3.11
SimonYansenZhao Sep 4, 2023
3149ae7
Merge in staging
SimonYansenZhao Feb 22, 2024
15fbf90
Pin pip=20.1.1
SimonYansenZhao Feb 22, 2024
a7f8346
Update dep versions
SimonYansenZhao Feb 22, 2024
9f9c815
Fix pandas import
SimonYansenZhao Feb 22, 2024
2fdf590
Set scipy <1.11.0 and sort dependencies alphabetically
SimonYansenZhao Feb 22, 2024
b2fef7a
Fix error caused by changes in scikit-learn
SimonYansenZhao Feb 22, 2024
9a225d5
Replace CollabDataBunch with CollabDataLoaders
SimonYansenZhao Feb 23, 2024
5484d9b
Replace max_lr with lr_max
SimonYansenZhao Feb 23, 2024
6944404
Correct usage of load_learner in fastai
SimonYansenZhao Feb 23, 2024
0e69106
Replace learner.data.train_ds.x.classes.values() with learner.dls.cla…
SimonYansenZhao Feb 23, 2024
22ef9b9
Replace learner.data.train_ds.x.classes.values() with learner.dls.cla…
SimonYansenZhao Feb 23, 2024
a5fea78
Upgrade fastai code
SimonYansenZhao Feb 23, 2024
dccac17
Correct the usage of torch.column_stack()
SimonYansenZhao Feb 23, 2024
d3b0ad7
Correct conversion from tensor to numpy
SimonYansenZhao Feb 24, 2024
d249bfe
Remove duplicate dependencies jinja2 and packaging required other pac…
SimonYansenZhao Feb 24, 2024
547ab66
Try Python 3.11
SimonYansenZhao Feb 24, 2024
ed3b632
Allow Python 3.11 for sarplus
SimonYansenZhao Feb 24, 2024
c8d90f7
Rerun and fix fastai movielens notebook
miguelgfierro Feb 24, 2024
d9ec1cd
Fixed deprecated attribute in fastai
miguelgfierro Feb 24, 2024
d59f7ec
Merge pull request #2068 from recommenders-team/miguel/simon_deps
SimonYansenZhao Feb 25, 2024
fda5265
Fixing breaking changes in fastai
miguelgfierro Mar 4, 2024
ac90e54
Upgrade GitHub Action azure/login
SimonYansenZhao Mar 12, 2024
1d0fe7d
Update fastai usage in utils
SimonYansenZhao Mar 15, 2024
0740b16
change deprecated azureml option (#2069)
loomlike Mar 15, 2024
89cc985
Update SP creation doc
loomlike Mar 15, 2024
55433c5
:memo:
miguelgfierro Mar 16, 2024
730a5e9
Fixing TF to < 2.16
miguelgfierro Mar 18, 2024
657531a
:bug:
miguelgfierro Mar 18, 2024
e99b8d0
model to CUDA as well as data
miguelgfierro Mar 18, 2024
03554de
Set tensorflow <= 2.15.0
SimonYansenZhao Mar 19, 2024
47281c8
Add missing colon
SimonYansenZhao Mar 19, 2024
19bcf1a
Merge branch 'simonz-dep-upgrade-20230606' into miguel/fix_tf
miguelgfierro Mar 19, 2024
b255fae
:memo:
miguelgfierro Mar 19, 2024
85899cf
Reducing DKN batch size to 200
miguelgfierro Mar 19, 2024
d8e8ac3
Move learner.model to cuda if cuda is available
SimonYansenZhao Mar 19, 2024
21492c9
Merge branch 'simonz-dep-upgrade-20230606' into miguel/fix_tf
miguelgfierro Mar 19, 2024
14c5c93
Merge pull request #2071 from recommenders-team/miguel/fix_tf
miguelgfierro Mar 19, 2024
2 changes: 1 addition & 1 deletion .github/workflows/azureml-cpu-nightly.yml
@@ -67,7 +67,7 @@ jobs:
     strategy:
       max-parallel: 50 # Usage limits: https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
       matrix:
-        python-version: ['"python=3.7"', '"python=3.8"', '"python=3.9"']
+        python-version: ['"python=3.8"', '"python=3.9"', '"python=3.10"']
         test-group: ${{ fromJSON(needs.get-test-groups.outputs.test_groups) }}
     steps:
     - name: Check out repository code
2 changes: 1 addition & 1 deletion .github/workflows/azureml-gpu-nightly.yml
@@ -67,7 +67,7 @@ jobs:
     strategy:
       max-parallel: 50 # Usage limits: https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
       matrix:
-        python-version: ['"python=3.7"', '"python=3.8"', '"python=3.9"']
+        python-version: ['"python=3.8"', '"python=3.9"', '"python=3.10"']
         test-group: ${{ fromJSON(needs.get-test-groups.outputs.test_groups) }}
     steps:
     - name: Check out repository code
2 changes: 1 addition & 1 deletion .github/workflows/azureml-spark-nightly.yml
@@ -66,7 +66,7 @@ jobs:
     strategy:
       max-parallel: 50 # Usage limits: https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
       matrix:
-        python-version: ['"python=3.7"', '"python=3.8"', '"python=3.9"']
+        python-version: ['"python=3.8"', '"python=3.9"', '"python=3.10"']
         test-group: ${{ fromJSON(needs.get-test-groups.outputs.test_groups) }}
     steps:
     - name: Check out repository code
2 changes: 1 addition & 1 deletion .github/workflows/azureml-unit-tests.yml
@@ -54,7 +54,7 @@ jobs:
     strategy:
       max-parallel: 50 # Usage limits: https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
       matrix:
-        python-version: ['"python=3.7"', '"python=3.8"', '"python=3.9"']
+        python-version: ['"python=3.8"', '"python=3.9"', '"python=3.10"']
         test-group: ${{ fromJSON(needs.get-test-groups.outputs.test_groups) }}
     steps:
     - name: Check out repository code
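
The four workflow changes above all swap the same `python-version` list, and each workflow runs one job per combination of Python version and test group. As a rough illustration of how that matrix fans out, here is a small Python sketch. The test group names are hypothetical (the real list comes from the `get-test-groups` job), and `expand_matrix` is a helper invented for this example, not part of the repository.

```python
from itertools import product

# Python versions taken from the updated workflow matrix.
PYTHON_VERSIONS = ['"python=3.8"', '"python=3.9"', '"python=3.10"']


def expand_matrix(python_versions, test_groups):
    """Return one job description per (python-version, test-group) pair."""
    return [
        {"python-version": version, "test-group": group}
        for version, group in product(python_versions, test_groups)
    ]


# Hypothetical test groups, used only to show the fan-out.
example_groups = ["group_cpu_001", "group_cpu_002"]
jobs = expand_matrix(PYTHON_VERSIONS, example_groups)
print(len(jobs))  # 3 versions x 2 groups
```

Dropping 3.7 and adding 3.10 keeps the number of versions at three, so the total job count per workflow should stay the same for a given set of test groups.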
74 changes: 33 additions & 41 deletions setup.py
@@ -8,7 +8,7 @@
 import sys
 import time

-# workround for enabling editable user pip installs
+# workaround for enabling editable user pip installs
 site.ENABLE_USER_SITE = "--user" in sys.argv[1:]

 # version
@@ -27,57 +27,48 @@
     version += ".post" + str(int(time.time()))

 install_requires = [
-    "numpy>=1.19",  # 1.19 required by tensorflow 2.6
-    "pandas>1.0.3,<2",
-    "scipy>=1.0.0,<2",
-    "tqdm>=4.31.1,<5",
-    "matplotlib>=2.2.2,<4",
-    "scikit-learn>=0.22.1,<1.0.3",
-    "numba>=0.38.1,<1",
-    "lightfm>=1.15,<2",
-    "lightgbm>=2.2.1",
-    "memory_profiler>=0.54.0,<1",
-    "nltk>=3.4,<4",
-    "seaborn>=0.8.1,<1",
-    "transformers>=2.5.0,<5",
-    "category_encoders>=1.3.0,<2",
-    "jinja2>=2,<3.1",
-    "requests>=2.0.0,<3",
-    "cornac>=1.1.2,<1.15.2;python_version<='3.7'",
-    "cornac>=1.15.2,<2;python_version>='3.8'",  # After 1.15.2, Cornac requires python 3.8
-    "retrying>=1.3.3",
-    "pandera[strategies]>=0.6.5",  # For generating fake datasets
-    "scikit-surprise>=1.0.6",
-    "scrapbook>=0.5.0,<1.0.0",
+    "pandas>1.5.2,<2.1",  # requires numpy
+    "scikit-learn>=1.1.3,<2",  # requires scipy
+    "numba>=0.57.0,<1",
+    "lightfm>=1.17,<2",
+    "lightgbm>=3.3.2,<4",
+    "memory-profiler>=0.61.0,<1",
+    "nltk>=3.8.1,<4",  # requires tqdm
+    "seaborn>=0.12.0,<1",  # requires matplotlib
+    "transformers>=4.26.0,<5",  # requires pyyaml, tqdm
+    "category-encoders>=2.6.0,<3",
+    "jinja2>=3.1.0,<3.2",
+    "cornac>=1.15.2,<2",  # requires tqdm
+    "retrying>=1.3.4",
+    "pandera[strategies]>=0.15.0",  # For generating fake datasets
+    "scikit-surprise>=1.1.3",
+    "scrapbook>=0.5.0,<1.0.0",  # requires tqdm, papermill
 ]

 # shared dependencies
 extras_require = {
     "examples": [
-        "hyperopt>=0.1.2,<1",
-        "ipykernel>=4.6.1,<7",
-        "jupyter>=1,<2",
-        "locust>=1,<2",
-        "papermill>=2.1.2,<3",
+        "hyperopt>=0.2.7,<1",
+        "notebook>=6.5.4,<8",  # requires jupyter, ipykernel
+        "locust>=2.15.1,<3",
     ],
     "gpu": [
-        "nvidia-ml-py3>=7.352.0",
-        # TensorFlow compiled with CUDA 11.2, cudnn 8.1
-        "tensorflow~=2.6.1;python_version=='3.6'",
-        "tensorflow~=2.7.0;python_version>='3.7'",
+        "nvidia-ml-py>=11.510.69",
+        # TensorFlow compiled with CUDA 11.8, cudnn 8.6.0.163
+        "tensorflow~=2.12.0",
         "tf-slim>=1.1.0",
-        "torch>=1.8",  # for CUDA 11 support
-        "fastai>=1.0.46,<2",
+        "torch>=2.0.1",
+        "fastai>=2.7.11,<3",
     ],
     "spark": [
-        "pyarrow>=0.12.1,<7.0.0",
-        "pyspark>=2.4.5,<3.3.0",
+        "pyarrow>=10.0.1",
+        "pyspark>=3.0.1,<=3.4.0",
     ],
     "dev": [
-        "black>=18.6b4,<21",
-        "pytest>=3.6.4",
-        "pytest-cov>=2.12.1",
-        "pytest-mock>=3.6.1",  # for access to mock fixtures in pytest
+        "black>=23.3.0,<24",
+        "pytest>=7.2.1",
+        "pytest-cov>=4.1.0",
+        "pytest-mock>=3.10.0",  # for access to mock fixtures in pytest
     ],
 }
 # for the brave of heart
@@ -123,6 +114,7 @@
         "Programming Language :: Python :: 3.7",
         "Programming Language :: Python :: 3.8",
         "Programming Language :: Python :: 3.9",
+        "Programming Language :: Python :: 3.10",
         "Operating System :: Microsoft :: Windows",
         "Operating System :: POSIX :: Linux",
         "Operating System :: MacOS",
@@ -132,7 +124,7 @@
     "machine learning python spark gpu",
     install_requires=install_requires,
     package_dir={"recommenders": "recommenders"},
-    python_requires=">=3.6, <3.10",
+    python_requires=">=3.8, <3.11",
     packages=find_packages(
         where=".",
         exclude=["contrib", "docs", "examples", "scenarios", "tests", "tools"],
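
The `python_requires` change above moves the supported range from `>=3.6, <3.10` to `>=3.8, <3.11`. The following sketch shows which interpreter versions fall inside the new range. It compares only `(major, minor)` tuples, which is a simplification of how pip evaluates version specifiers, and `is_supported` is a helper invented for this example.

```python
import sys

# Bounds copied from the updated setup.py: python_requires=">=3.8, <3.11".
MIN_VERSION = (3, 8)
MAX_VERSION_EXCLUSIVE = (3, 11)


def is_supported(version_info):
    """Return True if (major, minor) falls inside the declared range."""
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


for candidate in [(3, 7), (3, 8), (3, 10), (3, 11)]:
    print(candidate, is_supported(candidate))

print("current interpreter supported:", is_supported(sys.version_info))
```

Under this range 3.7 and 3.11 are excluded, which is consistent with the workflow matrices in this diff testing only 3.8, 3.9 and 3.10.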
35 changes: 21 additions & 14 deletions tests/ci/azureml_tests/submit_groupwise_azureml_pytest.py
@@ -37,7 +37,6 @@
 """
 import argparse
 import logging
-import glob

 from azureml.core.authentication import AzureCliAuthentication
 from azureml.core import Workspace
@@ -146,8 +145,7 @@ def setup_persistent_compute_target(workspace, cluster_name, vm_size, max_nodes)

 def create_run_config(
     cpu_cluster,
-    docker_proc_type,
-    workspace,
+    docker_image,
     add_gpu_dependencies,
     add_spark_dependencies,
     conda_pkg_jdk,
@@ -166,8 +164,7 @@ def create_run_config(
                 the following:
                 - Reco_cpu_test
                 - Reco_gpu_test
-        docker_proc_type (str) : processor type, cpu or gpu
-        workspace : workspace reference
+        docker_image (str) : docker image for cpu or gpu
         add_gpu_dependencies (bool) : True if gpu packages should be
                 added to the conda environment, else False
         add_spark_dependencies (bool) : True if PySpark packages should be
@@ -181,7 +178,20 @@ def create_run_config(
     run_azuremlcompute = RunConfiguration()
     run_azuremlcompute.target = cpu_cluster
     run_azuremlcompute.environment.docker.enabled = True
-    run_azuremlcompute.environment.docker.base_image = docker_proc_type
+    # See https://learn.microsoft.com/en-us/azure/machine-learning/how-to-train-with-custom-image?view=azureml-api-1#use-a-custom-dockerfile-optional
Collaborator

@simonzhaoms right now in the actions we are installing with run: pip install --quiet "azureml-core>1,<2" "azure-cli>2,<3". Which AzureML SDK are we installing? The latest one I see here is 1.51: https://pypi.org/project/azureml-sdk/#history.

Of the 3 options, I think the one that would be interesting to explore is "Try install everything in the docker file without using Conda", if we are reducing dependencies. I think 80% of our problems come from dependencies: #1936. So maybe something to reflect on is how we can reduce dependencies and use more standardized and robust software.

Collaborator Author

@miguelgfierro

> @simonzhaoms right now in the actions we are installing with run: pip install --quiet "azureml-core>1,<2" "azure-cli>2,<3". Which AzureML SDK are we installing? The latest one I see here is 1.51: https://pypi.org/project/azureml-sdk/#history.

I think this azureml-core is only used for launching the script submit_groupwise_azureml_pytest.py.

https://github.com/microsoft/recommenders/blob/b71c4ed66991f8eddfd80a8afbff05b34591c7ee/.github/actions/azureml-test/action.yml#L75-L77

The AzureML SDK I mentioned is the one used inside the Docker image launched by submit_groupwise_azureml_pytest.py.

https://github.com/microsoft/recommenders/blob/b71c4ed66991f8eddfd80a8afbff05b34591c7ee/.github/actions/azureml-test/action.yml#L85-L94

https://github.com/microsoft/recommenders/blob/b71c4ed66991f8eddfd80a8afbff05b34591c7ee/tests/ci/azureml_tests/submit_groupwise_azureml_pytest.py#L178-L194

I think when we use CondaDependencies.add_pip_package("xxx"), AzureML adds the item xxx to a Conda env YAML file that it maintains itself, and the AzureML SDK is also an item that AzureML adds implicitly by default.

https://github.com/microsoft/recommenders/blob/b71c4ed66991f8eddfd80a8afbff05b34591c7ee/tests/ci/azureml_tests/submit_groupwise_azureml_pytest.py#L202-L223

What I don't know is why the AzureML SDK triggers the error now, when I try to add support for Python 3.10 and upgrade all dependencies.

Collaborator Author

> Of the 3 options, I think the one that would be interesting to explore is "Try install everything in the docker file without using Conda", if we are reducing dependencies. I think 80% of our problems come from dependencies: #1936. So maybe something to reflect on is how we can reduce dependencies and use more standardized and robust software.

I agree. In addition, if possible, I'd use what GitHub Actions and workflows can provide to build the testing pipeline rather than the AzureML service, because the AzureML service is not transparent.
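
The thread above describes pip packages added through CondaDependencies.add_pip_package ending up in a Conda environment file. To picture what such a file contains, here is a minimal sketch that builds the usual Conda environment layout with plain dictionaries. It does not use the AzureML SDK: `new_conda_env` and the local `add_pip_package` are stand-ins invented for this example, the environment name is hypothetical, and whatever AzureML adds implicitly is not modeled.

```python
import json


def new_conda_env(name, python_spec):
    """Create a minimal Conda environment spec as a plain dict."""
    return {"name": name, "dependencies": [python_spec, "pip", {"pip": []}]}


def add_pip_package(env, requirement):
    """Append a requirement to the nested pip list, skipping duplicates."""
    pip_section = next(d for d in env["dependencies"] if isinstance(d, dict))
    if requirement not in pip_section["pip"]:
        pip_section["pip"].append(requirement)
    return env


env = new_conda_env("reco_test", "python=3.10")  # hypothetical env name
add_pip_package(env, "pytest>=7.2.1")
add_pip_package(env, "pytest-cov>=4.1.0")
print(json.dumps(env, indent=2))
```

Because every pip requirement lands in one nested list that is resolved inside the image, a package added implicitly by the service would be resolved together with the project's own pins, which is one plausible way an upgrade could surface a conflict.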

+    run_azuremlcompute.environment.docker.base_image = None
+    run_azuremlcompute.environment.docker.base_dockerfile = f"""
+FROM {docker_image}
+# Install system-level deps for scipy. See
+# https://docs.scipy.org/doc/scipy/dev/contributor/building.html
+RUN apt-get update && \
+    apt-get install -y \
+    gfortran \
+    libopenblas-dev \
+    liblapack-dev \
+    pkg-config
+RUN apt-get install -y git
+"""

     # Use conda_dependencies.yml to create a conda environment in
     # the Docker image for execution
@@ -425,13 +435,11 @@ def create_arg_parser():
     args = create_arg_parser()

     if args.dockerproc == "cpu":
-        from azureml.core.runconfig import DEFAULT_CPU_IMAGE
-
-        docker_proc_type = DEFAULT_CPU_IMAGE
+        # https://github.com/Azure/AzureML-Containers/blob/master/base/cpu/openmpi4.1.0-ubuntu22.04
+        docker_image = "mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04"
     else:
-        from azureml.core.runconfig import DEFAULT_GPU_IMAGE
-
-        docker_proc_type = DEFAULT_GPU_IMAGE
+        # https://github.com/Azure/AzureML-Containers/blob/master/base/gpu/openmpi4.1.0-cuda11.8-cudnn8-ubuntu22.04
+        docker_image = "mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.8-cudnn8-ubuntu22.04"

     cli_auth = AzureCliAuthentication()

@@ -452,8 +460,7 @@ def create_arg_parser():

     run_config = create_run_config(
         cpu_cluster=cpu_cluster,
-        docker_proc_type=docker_proc_type,
-        workspace=workspace,
+        docker_image=docker_image,
         add_gpu_dependencies=args.add_gpu_dependencies,
         add_spark_dependencies=args.add_spark_dependencies,
         conda_pkg_jdk=args.conda_pkg_jdk,
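
The script changes above replace the SDK's default images with explicit Ubuntu 22.04 images and wrap the chosen image in a small Dockerfile that installs the system packages scipy needs. The sketch below reproduces that flow without the AzureML SDK so it can be run locally. `select_docker_image` and `render_dockerfile` are helpers invented for this example; the image names and package list are copied from the diff.

```python
# Image names copied from the diff above.
CPU_IMAGE = "mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04"
GPU_IMAGE = "mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.8-cudnn8-ubuntu22.04"


def select_docker_image(dockerproc):
    """Mirror the script's if/else: "cpu" picks the CPU image, anything else the GPU one."""
    return CPU_IMAGE if dockerproc == "cpu" else GPU_IMAGE


def render_dockerfile(docker_image):
    """Build Dockerfile text like the one assigned to base_dockerfile in the diff."""
    lines = [
        f"FROM {docker_image}",
        "# Install system-level deps for scipy",
        "RUN apt-get update && \\",
        "    apt-get install -y \\",
        "    gfortran \\",
        "    libopenblas-dev \\",
        "    liblapack-dev \\",
        "    pkg-config",
        "RUN apt-get install -y git",
    ]
    return "\n".join(lines) + "\n"


print(render_dockerfile(select_docker_image("cpu")))
```

One difference from the diff is worth noting: in a non-raw triple-quoted Python string, a backslash at the end of a line is consumed as a line continuation, so the f-string in the PR yields a single long `RUN` line, whereas this sketch keeps the literal backslashes. Both forms describe the same Docker instruction.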