Add TransformerEngine test workflow #4061

Triggered via pull request March 11, 2025 11:09
Status Cancelled
Total duration 16m 13s
Artifacts 3

ci.yaml

on: pull_request
metadata (0s)
bump-manifest (14s)
Matrix: amd64 / test-te-h100 / transformer-engine-test-eks
Matrix: amd64 / test-te-unit-a100 / run-unit-test
Matrix: arm64 / test-te-h100 / transformer-engine-test-eks (waiting for pending jobs)
Matrix: arm64 / test-te-unit-a100 / run-unit-test (waiting for pending jobs)
amd64 / test-te-unit-a100 / runner / launch-slurm-runner (10m 35s)
arm64 / test-te-unit-a100 / runner / launch-slurm-runner
Matrix: amd64 / test-te-multigpu-a100 / te-multi-gpu
Matrix: arm64 / test-te-multigpu-a100 / te-multi-gpu (waiting for pending jobs)
amd64 / test-te-multigpu-a100 / sitrep (0s)
arm64 / test-te-multigpu-a100 / sitrep
make-publish-configs (0s)
merge-new-manifest (0s)
Matrix: publish-containers (waiting for pending jobs)
finalize / workflow-badge
finalize / report
finalize / upload-badge
finalize / publish-badge

Annotations

8 errors and 1 warning
amd64 / test-te-h100 / transformer-engine-test-eks (multigpu, 8)
Canceling since a higher priority waiting request for 'CI-alechan/add-te' exists
amd64 / test-te-h100 / transformer-engine-test-eks (unittest, 8)
Canceling since a higher priority waiting request for 'CI-alechan/add-te' exists
amd64 / test-te-unit-a100 / runner / launch-slurm-runner
Canceling since a higher priority waiting request for 'CI-alechan/add-te' exists
amd64 / test-te-unit-a100 / runner / launch-slurm-runner
The operation was canceled.
amd64 / test-te-multigpu-a100 / te-multi-gpu (8) / te-multi-gpu-8
Canceling since a higher priority waiting request for 'CI-alechan/add-te' exists
amd64 / test-te-multigpu-a100 / te-multi-gpu (8) / te-multi-gpu-8
The operation was canceled.
amd64 / test-te-unit-a100 / te-A100-unit-test
Canceling since a higher priority waiting request for 'CI-alechan/add-te' exists
amd64 / test-te-unit-a100 / te-A100-unit-test
The operation was canceled.
amd64 / test-te-unit-a100 / te-A100-unit-test
Runner A100-3db5499268f1 did not respond to a cancelation request with 00:05:00.

Artifacts

Produced during runtime
Name                                                                 Size
artifact-multigpu-test-transformerengine-13786590548-2gpu-multigpu   573 Bytes
artifact-multigpu-test-transformerengine-13786590548-4gpu-multigpu   575 Bytes
bumped-manifest                                                      46.6 KB