
[CI] Some improvements to Nightly reports summaries #11166

Open · wants to merge 11 commits into base: main
Conversation

@DN6 (Collaborator) commented Mar 28, 2025

What does this PR do?

We currently create summary reports for each test module, but the number of pipelines we test in the nightlies has grown considerably, and scrolling through all of the individual reports is becoming challenging. This PR introduces an additional step that consolidates the individual reports into a single report with some useful summary information.
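The consolidation step can be sketched roughly as follows. This is a minimal illustration, not the PR's actual implementation: the `render_summary` helper, the per-suite dict layout, and the sample numbers are all hypothetical.

```python
# Hypothetical sketch: aggregate per-suite stats into one Markdown summary
# table (not the PR's actual code; the suite data below is made up).

def render_summary(suites):
    """suites: list of dicts with keys tests, passed, failed, skipped, duration."""
    # Sum each numeric field across all test suites.
    total = {k: sum(s[k] for s in suites)
             for k in ("tests", "passed", "failed", "skipped", "duration")}
    rate = 100.0 * total["passed"] / total["tests"]
    lines = [
        "## Summary",
        "| Metric | Value |",
        "|:---|:---|",
        f"| Total Tests | {total['tests']} |",
        f"| Passed | {total['passed']} |",
        f"| Failed | {total['failed']} |",
        f"| Skipped | {total['skipped']} |",
        f"| Success Rate | {rate:.2f}% |",
        f"| Total Duration | {total['duration']:.2f}s |",
    ]
    return "\n".join(lines)

# Example input (fabricated numbers for illustration only):
suites = [
    {"name": "torch_models_cuda", "tests": 2016, "passed": 1729,
     "failed": 6, "skipped": 277, "duration": 576.82},
    {"name": "pipeline_cogvideo", "tests": 135, "passed": 129,
     "failed": 2, "skipped": 4, "duration": 332.40},
]
print(render_summary(suites))
```

The same per-suite dicts can also feed the "Test Suites" table, so both sections are rendered from one pass over the collected reports.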

A shorter summary report is also sent to the diffusers-ci Slack channel with a link to the full report in GitHub Actions.
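Sending the short summary to Slack is typically done with an incoming webhook. A minimal sketch, assuming a standard Slack incoming-webhook URL; the URL, message text, and link below are placeholders, not the PR's actual values:

```python
# Hypothetical sketch: post a short summary to Slack via an incoming
# webhook (URL and payload values are placeholders).
import json
import urllib.request

def build_payload(text):
    """Build the JSON body that Slack incoming webhooks accept."""
    return json.dumps({"text": text}).encode("utf-8")

def post_to_slack(webhook_url, text):
    req = urllib.request.Request(
        webhook_url,
        data=build_payload(text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # Slack replies "ok" on success
        return resp.read().decode("utf-8")

summary = (
    "Diffusers Nightly: 2429 tests, 2121 passed, 11 failed, 293 skipped.\n"
    "Full report: <link to the GitHub Actions run>"
)
# post_to_slack("https://hooks.slack.com/services/XXX/YYY/ZZZ", summary)
```

Keeping the Slack message short and linking back to the full report avoids hitting Slack message-length limits while still surfacing the headline numbers.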

Example report below

# Diffusers Nightly Test Report
Generated on: 2025-03-28 12:04:31

## Summary
| Metric         | Value    |
|:---------------|:---------|
| Total Tests    | 2429     |
| Passed         | 2121     |
| Failed         | 11       |
| Skipped        | 293      |
| Success Rate   | 87.32%   |
| Total Duration | 1768.28s |

## Test Suites
| Test Suite                                                  |   Tests |   Passed |   Failed |   Skipped | Success Rate   |   Duration (s) |
|:------------------------------------------------------------|--------:|---------:|---------:|----------:|:---------------|---------------:|
| torch_models_cuda/tests_torch_models_cuda                   |    2016 |     1729 |        6 |       277 | 85.76%         |         576.82 |
| torch_minimum_version_cuda/tests_torch_minimum_version_cuda |     250 |      235 |        3 |        12 | 94.00%         |         498.16 |
| pipeline_cogvideo/tests_pipeline_cogvideo_cuda              |     135 |      129 |        2 |         4 | 95.56%         |         332.4  |
| torch_cuda_gguf_reports/tests_gguf_torch_cuda               |      28 |       28 |        0 |         0 | 100.00%        |         360.9  |

## Slowest Tests
|   Rank | Test                                                                                                                     |   Duration (s) | Test Suite                                                  |
|-------:|:-------------------------------------------------------------------------------------------------------------------------|---------------:|:------------------------------------------------------------|
|      1 | tests/pipelines/test_pipelines_auto.py::AutoPipelineIntegrationTest::test_from_pipe_consistent                           |         156.63 | torch_minimum_version_cuda/tests_torch_minimum_version_cuda |
|      2 | tests/pipelines/cogvideo/test_cogvideox_image2video.py::CogVideoXImageToVideoPipelineIntegrationTests::test_cogvideox    |         139.11 | pipeline_cogvideo/tests_pipeline_cogvideo_cuda              |
|      3 | tests/pipelines/cogvideo/test_cogvideox.py::CogVideoXPipelineIntegrationTests::test_cogvideox                            |         106.64 | pipeline_cogvideo/tests_pipeline_cogvideo_cuda              |
|      4 | tests/quantization/gguf/test_gguf.py::SD35MediumGGUFSingleFileTests::test_pipeline_inference                             |          81.41 | torch_cuda_gguf_reports/tests_gguf_torch_cuda               |
|      5 | tests/pipelines/test_pipelines.py::PipelineNightlyTests::test_ddpm_ddim_equality_batched                                 |          80.82 | torch_minimum_version_cuda/tests_torch_minimum_version_cuda |
|      6 | tests/quantization/gguf/test_gguf.py::SD35LargeGGUFSingleFileTests::test_pipeline_inference                              |          76.58 | torch_cuda_gguf_reports/tests_gguf_torch_cuda               |
|      7 | tests/quantization/gguf/test_gguf.py::FluxGGUFSingleFileTests::test_pipeline_inference                                   |          54.15 | torch_cuda_gguf_reports/tests_gguf_torch_cuda               |
|      8 | tests/pipelines/test_pipelines.py::PipelineSlowTests::test_weighted_prompts_compel                                       |          40.98 | torch_minimum_version_cuda/tests_torch_minimum_version_cuda |
|      9 | tests/models/autoencoders/test_models_consistency_decoder_vae.py::ConsistencyDecoderVAEIntegrationTests::test_vae_tiling |          34.08 | torch_models_cuda/tests_torch_models_cuda                   |
|     10 | tests/pipelines/test_pipelines_auto.py::AutoPipelineIntegrationTest::test_controlnet                                     |          30.95 | torch_minimum_version_cuda/tests_torch_minimum_version_cuda |

## Failures
### AutoPipelineIntegrationTest
tests/pipelines/test_pipelines_auto.py::AutoPipelineIntegrationTest::test_from_pipe_consistent - ValueError: You are trying to load model files of the `variant=fp16`, but no such modeling files are available.
tests/pipelines/test_pipelines_auto.py::AutoPipelineIntegrationTest::test_pipe_auto - ValueError: You are trying to load model files of the `variant=fp16`, but no such modeling files are available.

### AutoencoderOobleckIntegrationTests
tests/models/autoencoders/test_models_autoencoder_oobleck.py::AutoencoderOobleckIntegrationTests::test_stable_diffusion_0 - ImportError: Numba needs NumPy 2.1 or less. Got NumPy 2.2.
tests/models/autoencoders/test_models_autoencoder_oobleck.py::AutoencoderOobleckIntegrationTests::test_stable_diffusion_1 - ImportError: Numba needs NumPy 2.1 or less. Got NumPy 2.2.
tests/models/autoencoders/test_models_autoencoder_oobleck.py::AutoencoderOobleckIntegrationTests::test_stable_diffusion_encode_decode_0 - ImportError: Numba needs NumPy 2.1 or less. Got NumPy 2.2.
tests/models/autoencoders/test_models_autoencoder_oobleck.py::AutoencoderOobleckIntegrationTests::test_stable_diffusion_encode_decode_1 - ImportError: Numba needs NumPy 2.1 or less. Got NumPy 2.2.
tests/models/autoencoders/test_models_autoencoder_oobleck.py::AutoencoderOobleckIntegrationTests::test_stable_diffusion_mode - ImportError: Numba needs NumPy 2.1 or less. Got NumPy 2.2.

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@DN6 DN6 requested a review from sayakpaul March 28, 2025 12:13
@sayakpaul (Member) left a comment

Good initiative. Can we see an example message on Slack?

Also, the benefit of the previous action was that it pointed to the specific action run that the failing tests are part of. Are we doing that in this PR? If not, I'd emphasize including that part.
