Add TransformerEngine test workflow #4068

Re-run triggered March 12, 2025 10:21
Status Cancelled
Total duration 2m 7s
Artifacts 3

ci.yaml

on: pull_request
metadata (0s)
bump-manifest (25s)
Matrix: amd64 / test-te-h100 / transformer-engine-test-eks
Matrix: amd64 / test-te-unit-a100 / run-unit-test
Matrix: arm64 / test-te-h100 / transformer-engine-test-eks (waiting for pending jobs)
Matrix: arm64 / test-te-unit-a100 / run-unit-test (waiting for pending jobs)
amd64 / test-te-unit-a100 / runner / launch-slurm-runner (8m 25s)
arm64 / test-te-unit-a100 / runner / launch-slurm-runner
Matrix: amd64 / test-te-multigpu-a100 / te-multi-gpu
Matrix: arm64 / test-te-multigpu-a100 / te-multi-gpu (waiting for pending jobs)
amd64 / test-te-multigpu-a100 / sitrep (0s)
arm64 / test-te-multigpu-a100 / sitrep
make-publish-configs (0s)
merge-new-manifest (0s)
Matrix: publish-containers (waiting for pending jobs)
finalize / workflow-badge
finalize / report
finalize / upload-badge
finalize / publish-badge
Annotations

7 errors
amd64 / test-te-h100 / transformer-engine-test-eks (multigpu, 4)
The job was canceled because "multigpu_2" failed.
amd64 / test-te-h100 / transformer-engine-test-eks (multigpu, 8)
The job was canceled because "multigpu_2" failed.
amd64 / test-te-unit-a100 / te-A100-unit-test
Process completed with exit code 1.
amd64 / test-te-h100 / transformer-engine-test-eks (unittest, 8)
The job was canceled because "multigpu_2" failed.
amd64 / test-te-multigpu-a100 / te-multi-gpu (8) / te-multi-gpu-8
The operation was canceled.
amd64 / test-te-h100 / transformer-engine-test-eks (multigpu, 2)
Canceling since a higher priority waiting request for 'CI-alechan/add-te' exists