[fbgemm_gpu] Update Nova jobs #3890

Closed
14 changes: 12 additions & 2 deletions .github/scripts/nova_postscript.bash
@@ -25,6 +25,11 @@ echo "[NOVA] Current working directory: $(pwd)"
 # Record time for each step
 start_time=$(date +%s)

+echo "################################################################################"
+echo "Environment Variables:"
+printenv
+echo "################################################################################"
+
 # Collect PyTorch environment information
 collect_pytorch_env_info "${BUILD_ENV_NAME}"
 end_time=$(date +%s)
@@ -42,8 +47,13 @@ echo "[NOVA] Time taken to install wheel: ${runtime} seconds"
 # Test with PyTest
 echo "[NOVA] Current working directory: $(pwd)"
 if [[ $CU_VERSION = cu* ]]; then
-  echo "[NOVA] Testing the CUDA variant of FBGEMM_GPU ..."
-  export fbgemm_variant="cuda"
+  if [[ ${BUILD_TARGET} == "genai" ]]; then
+    echo "[NOVA] Testing the GenAI variant of FBGEMM_GPU ..."
+    export fbgemm_variant="genai"
+  else
+    echo "[NOVA] Testing the CUDA variant of FBGEMM_GPU ..."
+    export fbgemm_variant="cuda"
+  fi

 elif [[ $CU_VERSION = rocm* ]]; then
   echo "[NOVA] Testing the ROCm variant of FBGEMM_GPU ..."
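
The variant selection in the hunk above can be read as a small decision function. The following Python sketch mirrors the branch order visible in the diff; the function name `select_fbgemm_variant` is hypothetical, and the final `"cpu"` fallback is an assumption, since the script's `else` branch is not shown in this diff.

```python
def select_fbgemm_variant(cu_version: str, build_target: str = "") -> str:
    """Mirror the branch order of the Nova script hunk (illustrative only)."""
    # Bash `[[ $CU_VERSION = cu* ]]` is a prefix glob, i.e. "starts with cu".
    if cu_version.startswith("cu"):
        # Within the CUDA branch, the GenAI build target takes precedence.
        return "genai" if build_target == "genai" else "cuda"
    if cu_version.startswith("rocm"):
        return "rocm"
    # Assumption: the elided else branch selects a CPU variant.
    return "cpu"


print(select_fbgemm_variant("cu126", "genai"))
print(select_fbgemm_variant("cu126"))
print(select_fbgemm_variant("rocm6.3", "genai"))
```

Note that in the logic as shown, `BUILD_TARGET` is only consulted when `CU_VERSION` starts with `cu`; for any other value it has no effect on the selected variant.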
14 changes: 12 additions & 2 deletions .github/scripts/nova_prescript.bash
@@ -21,6 +21,11 @@ BUILD_ENV_NAME=${CONDA_ENV}
 # Record time for each step
 start_time=$(date +%s)

+echo "################################################################################"
+echo "Environment Variables:"
+printenv
+echo "################################################################################"
+
 # Display System Info
 print_system_info
 end_time=$(date +%s)
@@ -94,8 +99,13 @@ if [[ $CU_VERSION = cu* ]]; then
   start_time=${end_time}
   echo "[NOVA] Time taken to find NVML_LIB_PATH: ${runtime} seconds"

-  echo "[NOVA] Building the CUDA variant of FBGEMM_GPU ..."
-  export fbgemm_variant="cuda"
+  if [[ ${BUILD_TARGET} == "genai" ]]; then
+    echo "[NOVA] Building the GenAI variant of FBGEMM_GPU ..."
+    export fbgemm_variant="genai"
+  else
+    echo "[NOVA] Building the CUDA variant of FBGEMM_GPU ..."
+    export fbgemm_variant="cuda"
+  fi

 elif [[ $CU_VERSION = rocm* ]]; then
   echo "[NOVA] Building the ROCm variant of FBGEMM_GPU ..."
64 changes: 64 additions & 0 deletions .github/workflows/build_wheels_genai_linux_aarch64.yml
@@ -0,0 +1,64 @@
+name: Build FBGEMM GenAI Aarch64 Linux Wheels
+
+on:
+  pull_request:
+  push:
+    branches:
+      - nightly
+      - main
+    tags:
+      # Release candidate tag look like: v1.11.0-rc1
+      - v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+
+      - v[0-9]+.[0-9]+.[0-9]+
+  workflow_dispatch:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
+
+permissions:
+  id-token: write
+  contents: read
+
+jobs:
+  generate-matrix:
+    if: ${{ github.repository_owner == 'pytorch' }}
+    uses: pytorch/test-infra/.github/workflows/generate_binary_build_matrix.yml@main
+    with:
+      package-type: wheel
+      os: linux-aarch64
+      test-infra-repository: pytorch/test-infra
+      test-infra-ref: main
+      with-cuda: disable
+
+  build:
+    if: ${{ github.repository_owner == 'pytorch' }}
+    needs: generate-matrix
+    strategy:
+      fail-fast: false
+      matrix:
Review comment (Contributor):

Nit: There is only one configuration here, so you could consider getting rid of the matrix and setting these parameters directly, as the x86 workflow does.

+        include:
+          - repository: pytorch/FBGEMM
+            smoke-test-script: ""
+            pre-script: ../.github/scripts/nova_prescript.bash
+            post-script: ../.github/scripts/nova_postscript.bash
+            env-var-script: .github/scripts/nova_dir.bash
+            package-name: fbgemm_gpu
+    name: ${{ matrix.repository }}
+    uses: pytorch/test-infra/.github/workflows/build_wheels_linux.yml@main
+    with:
+      repository: ${{ matrix.repository }}
+      ref: ""
+      test-infra-repository: pytorch/test-infra
+      test-infra-ref: main
+      build-matrix: ${{ needs.generate-matrix.outputs.matrix }}
+      pre-script: ${{ matrix.pre-script }}
+      post-script: ${{ matrix.post-script }}
+      package-name: ${{ matrix.package-name }}
+      build-target: genai
+      env-var-script: ${{ matrix.env-var-script }}
+      smoke-test-script: ${{ matrix.smoke-test-script }}
+      trigger-event: ${{ github.event_name }}
+      architecture: aarch64
+      setup-miniconda: false
+      timeout: 210
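
To check which tags the two `tags:` filters in this workflow would accept, the patterns can be hand-translated into regular expressions. This is a rough sketch, assuming the GitHub Actions filter syntax treats `.` as a literal character, `+` as "one or more of the preceding character class", and matches the whole tag name; the helper name `tag_triggers_build` is hypothetical.

```python
import re

# Hand-translated equivalents of the workflow's tag filters.
# Assumption: '.' is literal and '+' means "one or more" in the filter syntax.
RC_TAG = re.compile(r"v[0-9]+\.[0-9]+\.[0-9]+-rc[0-9]+")
RELEASE_TAG = re.compile(r"v[0-9]+\.[0-9]+\.[0-9]+")


def tag_triggers_build(tag: str) -> bool:
    """Return True if the tag matches either filter in full."""
    return bool(RC_TAG.fullmatch(tag) or RELEASE_TAG.fullmatch(tag))


for candidate in ("v1.11.0-rc1", "v1.11.0", "v1.11", "v1.11.0-release"):
    print(candidate, tag_triggers_build(candidate))
```

Under these assumptions, a release candidate tag such as `v1.11.0-rc1` and a plain release tag such as `v1.11.0` both match, while a two-component version or a `-release` suffix does not.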
54 changes: 54 additions & 0 deletions .github/workflows/build_wheels_genai_linux_x86.yml
@@ -0,0 +1,54 @@
+name: Build FBGEMM GenAI x86 Linux Wheels
+
+on:
+  pull_request:
+  push:
+    branches:
+      - nightly
+      - main
+    tags:
+      # Release candidate tag look like: v1.11.0-rc1
+      - v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+
+      - v[0-9]+.[0-9]+.[0-9]+
+  workflow_dispatch:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
+
+permissions:
+  id-token: write
+  contents: read
+
+jobs:
+  generate-matrix:
+    if: ${{ github.repository_owner == 'pytorch' }}
+    uses: pytorch/test-infra/.github/workflows/generate_binary_build_matrix.yml@main
+    with:
+      package-type: wheel
+      os: linux
+      test-infra-repository: pytorch/test-infra
+      test-infra-ref: main
+      with-cuda: enable
+      with-rocm: disable
+      with-cpu: disable
+
+  build:
+    if: ${{ github.repository_owner == 'pytorch' }}
+    needs: generate-matrix
+    name: pytorch/FBGEMM
+    uses: pytorch/test-infra/.github/workflows/build_wheels_linux.yml@main
+    with:
+      repository: pytorch/FBGEMM
+      ref: ""
+      pre-script: ../.github/scripts/nova_prescript.bash
+      post-script: ../.github/scripts/nova_postscript.bash
+      smoke-test-script: ""
+      env-var-script: .github/scripts/nova_dir.bash
+      package-name: fbgemm_gpu
+      build-target: genai
+      test-infra-repository: pytorch/test-infra
+      test-infra-ref: main
+      build-matrix: ${{ needs.generate-matrix.outputs.matrix }}
+      trigger-event: ${{ github.event_name }}
+      timeout: 240
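
The `concurrency.group` expression in this workflow keys runs by pull request number when one exists and by git ref otherwise. A minimal Python sketch of how that key resolves, assuming `a || b` in a workflow expression yields `b` when `a` is empty or null; the function name and the sample refs are illustrative only.

```python
from typing import Optional


def concurrency_group(workflow: str, ref: str, pr_number: Optional[int] = None) -> str:
    """Approximate `${{ github.workflow }}-${{ pr.number || github.ref }}`."""
    # Pushes and manual dispatches carry no pull request number,
    # so they fall back to the git ref.
    key = str(pr_number) if pr_number is not None else ref
    return f"{workflow}-{key}"


name = "Build FBGEMM GenAI x86 Linux Wheels"
print(concurrency_group(name, "refs/pull/3890/merge", pr_number=3890))
print(concurrency_group(name, "refs/heads/main"))
```

With `cancel-in-progress: true`, a new run in the same group cancels the run already in progress, so successive pushes to one pull request share a group while pushes to different branches do not.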
5 changes: 2 additions & 3 deletions .github/workflows/build_wheels_linux_aarch64.yml
@@ -1,13 +1,11 @@
-name: Build Aarch64 Linux Wheels
+name: Build FBGEMM_GPU Aarch64 Linux Wheels

 on:
   pull_request:
   push:
     branches:
       - nightly
       - main
-      # Release candidate branch look like: v1.11.0-release
-      - v[0-9]+.[0-9]+.[0-9]+-release+
     tags:
       # Release candidate tag look like: v1.11.0-rc1
       - v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+
@@ -32,6 +30,7 @@ jobs:
       test-infra-repository: pytorch/test-infra
       test-infra-ref: main
       with-cuda: disable
+
   build:
     if: ${{ github.repository_owner == 'pytorch' }}
     needs: generate-matrix
5 changes: 2 additions & 3 deletions .github/workflows/build_wheels_linux_x86.yml
@@ -1,13 +1,11 @@
-name: Build x86 Linux Wheels
+name: Build FBGEMM_GPU x86 Linux Wheels

 on:
   pull_request:
   push:
     branches:
       - nightly
       - main
-      # Release candidate branch look like: v1.11.0-rc1
-      - v[0-9]+.[0-9]+.[0-9]+-release+
     tags:
       # Release candidate tag look like: v1.11.0-rc1
       - v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+
@@ -34,6 +32,7 @@ jobs:
       with-cuda: enable
       with-rocm: enable
       with-cpu: enable
+
   build:
     if: ${{ github.repository_owner == 'pytorch' }}
     needs: generate-matrix
8 changes: 6 additions & 2 deletions fbgemm_gpu/setup.py
@@ -116,8 +116,12 @@ def package_name(self) -> str:
             sys.exit(0)

         elif self.nova_flag() == 0:
-            # The package name is the same for all build variants in Nova
-            pass
+            # In Nova, we are publishing genai packages separately from the main
+            # fbgemm_gpu package, so if the package variant is genai, we need to
+            # update the package name accordingly. Otherwise, the package name
+            # is the same for all other build variants in Nova
+            if self.args.package_variant == "genai":
+                pkg_name = "fbgemm_gpu_genai"

         else:
             # If running outside of Nova workflow context, append the channel
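
The effect of the Nova branch changed above can be isolated into a small standalone function. This is a sketch rather than the code in `setup.py`: the function name `nova_package_name` is hypothetical, and the default base name `fbgemm_gpu` is an assumption taken from the `package-name` value used in the workflows, since the initial value of `pkg_name` is not visible in this hunk.

```python
def nova_package_name(package_variant: str, base_name: str = "fbgemm_gpu") -> str:
    """Package name chosen inside the Nova branch (illustrative only)."""
    # Only the GenAI variant is published under a separate name in Nova.
    if package_variant == "genai":
        return "fbgemm_gpu_genai"
    # Assumption: every other variant keeps the unchanged base name.
    return base_name


for variant in ("genai", "cuda", "rocm", "cpu"):
    print(variant, "->", nova_package_name(variant))
```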