Add TransformerEngine test workflow #4068
ci.yaml
on: pull_request
metadata
0s
Matrix: amd64 / test-te-h100 / transformer-engine-test-eks
Matrix: amd64 / test-te-unit-a100 / run-unit-test
Matrix: arm64 / test-te-h100 / transformer-engine-test-eks
Waiting for pending jobs
Matrix: arm64 / test-te-unit-a100 / run-unit-test
Waiting for pending jobs
amd64 / ... / launch-slurm-runner
8m 25s
arm64 / ... / launch-slurm-runner
Matrix: amd64 / test-te-multigpu-a100 / te-multi-gpu
Matrix: arm64 / test-te-multigpu-a100 / te-multi-gpu
Waiting for pending jobs
merge-new-manifest
0s
Matrix: publish-containers
Waiting for pending jobs
finalize / report
finalize / publish-badge
Annotations
7 errors
amd64 / test-te-h100 / transformer-engine-test-eks (multigpu, 4)
The job was canceled because "multigpu_2" failed.
amd64 / test-te-h100 / transformer-engine-test-eks (multigpu, 8)
The job was canceled because "multigpu_2" failed.
amd64 / test-te-unit-a100 / te-A100-unit-test
Process completed with exit code 1.
amd64 / test-te-h100 / transformer-engine-test-eks (unittest, 8)
The job was canceled because "multigpu_2" failed.
amd64 / test-te-multigpu-a100 / te-multi-gpu (8) / te-multi-gpu-8
The run was canceled by @aybchan.
amd64 / test-te-multigpu-a100 / te-multi-gpu (8) / te-multi-gpu-8
The operation was canceled.
amd64 / test-te-h100 / transformer-engine-test-eks (multigpu, 2)
Canceling since a higher priority waiting request for 'CI-alechan/add-te' exists