[Docs] Update Wan Docs with memory optimizations #11089

Merged
merged 2 commits into from
Mar 28, 2025

Conversation

DN6
Collaborator

@DN6 DN6 commented Mar 17, 2025

What does this PR do?

Based on feedback here
https://huggingface.slack.com/archives/C065E480NN9/p1742176300453069

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@DN6 DN6 requested review from asomoza and a-r-r-o-w March 20, 2025 12:22
Member

@a-r-r-o-w a-r-r-o-w left a comment

Looks great, thanks!

Member

@asomoza asomoza left a comment

thanks, looks great, just a couple of comments that aren't blockers, just my opinion.


We will first need to install some additional dependencies.

```shell
Member

maybe we should start telling the users what the additional dependencies are and add a link to them so they feel more secure and understand what they are installing?

we can add just a link to the pypi page too: https://pypi.org/project/ftfy/

Also now that I see it, maybe this shouldn't be a required dependency but an optional one? I'll take a look later on how it's used.

@@ -65,6 +403,11 @@ transformer = WanTransformer3DModel.from_single_file(ckpt_path, torch_dtype=torc
pipe = WanPipeline.from_pretrained("Wan-AI/Wan2.1-T2V-1.3B-Diffusers", transformer=transformer)
```

## Recommendations for Inference:
- Keep `AutoencoderKLWan` in `torch.float32` for better decoding quality.
- `num_frames` should be of the form `4 * k + 1`, for example `49` or `81`.
Member

maybe we can be clearer here and write that k is the frames per second, or fps in more common language?
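
For the `num_frames` recommendation quoted above, a small check can make the `4 * k + 1` constraint concrete. This is only a sketch: it assumes `k` is any non-negative integer (which matches the two examples, since 49 = 4 * 12 + 1 and 81 = 4 * 20 + 1), and the helper names `is_valid_num_frames` and `floor_to_valid_num_frames` are hypothetical, not part of diffusers.

```python
def is_valid_num_frames(num_frames: int) -> bool:
    """Return True if num_frames can be written as 4 * k + 1 for an integer k >= 0."""
    return num_frames >= 1 and (num_frames - 1) % 4 == 0


def floor_to_valid_num_frames(num_frames: int) -> int:
    """Round down to the closest value of the form 4 * k + 1 (never below 1)."""
    if num_frames < 1:
        return 1
    return ((num_frames - 1) // 4) * 4 + 1


if __name__ == "__main__":
    # Check a few candidate frame counts, including the two from the docs.
    for candidate in (49, 50, 81, 84):
        print(candidate, is_valid_num_frames(candidate), floor_to_valid_num_frames(candidate))
```

A helper like this could be used to adjust a requested frame count before passing `num_frames` to the pipeline.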


#### Block Level Group Offloading

We can reduce our VRAM requirements by applying group offloading to the larger model components of the pipeline: the `WanTransformer3DModel` and `UMT5EncoderModel`. Group offloading breaks up the individual modules of a model and offloads/onloads them to and from your GPU as needed during inference. In this example, we'll apply `block_level` offloading, which groups the modules in a model into blocks of size `num_blocks_per_group` and offloads/onloads them to the GPU. Moving modules between CPU and GPU does add latency to the inference process. You can trade off between latency and memory savings by increasing or decreasing `num_blocks_per_group`.

could we apply group offload on vae?
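
To illustrate the latency/memory trade-off that the block-level paragraph above describes, here is a minimal pure-Python sketch of how blocks might be partitioned into groups of size `num_blocks_per_group`. It is a simplified model, not the diffusers implementation: the `group_blocks` helper and the eight-block model are hypothetical, and it assumes each group is moved to the GPU once per forward pass.

```python
from typing import List, Sequence


def group_blocks(block_names: Sequence[str], num_blocks_per_group: int) -> List[List[str]]:
    """Split an ordered list of blocks into consecutive groups of the given size."""
    if num_blocks_per_group < 1:
        raise ValueError("num_blocks_per_group must be >= 1")
    return [
        list(block_names[i : i + num_blocks_per_group])
        for i in range(0, len(block_names), num_blocks_per_group)
    ]


if __name__ == "__main__":
    # Hypothetical model with 8 transformer blocks.
    blocks = [f"block_{i}" for i in range(8)]
    for size in (1, 2, 4):
        groups = group_blocks(blocks, size)
        # Under the simplifying assumption above: fewer groups means fewer
        # CPU/GPU transfers, but more blocks resident on the GPU at once.
        print(f"num_blocks_per_group={size}: {len(groups)} groups, up to {size} blocks on GPU at once")
```

In this simplified picture, a larger `num_blocks_per_group` means fewer transfers (less latency) but more blocks held on the GPU at a time (more memory), and a smaller value does the opposite.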

@DN6 DN6 merged commit 617c208 into main Mar 28, 2025
4 checks passed
@tin2tin

tin2tin commented Mar 29, 2025

Thank you for this, super useful information. Have been struggling to get Wan i2v and Group Offloading working. I've tried many things to get Wan i2v to work, and properly bnb too. Are quantizations (w. ex. bitsandbytes) supposed to work on Wan too?

from diffusers import AutoencoderKLWan, WanTransformer3DModel, WanImageToVideoPipeline
from diffusers.hooks.group_offloading import apply_group_offloading
from diffusers.utils import export_to_video, load_image
from transformers import UMT5EncoderModel, CLIPVisionMode

`CLIPVisionMode` should be `CLIPVisionModel` (the trailing `l` is missing).

"An astronaut hatching from an egg, on the surface of the moon, the darkness and depth of space realised in "
"the background. High quality, ultrarealistic detail and breath-taking movie-like camera shot."
)
negative_prompt = "Bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards

The closing `"` is missing at the end of `negative_prompt`.
