
[do not merge] Rebased refactor #270
Draft · wants to merge 41 commits into main
Conversation

@jzhang38 (Collaborator) commented Mar 16, 2025

[ ] Attn backend (PY)
[ ] Wan text encoder (PY)
[ ] Wan VAE (Wei)
[ ] Wan pipeline (PY & Wei)
[ ] Merge Wan DiT code to refactor (Will)
[ ] Clean up code & loader directory (Will)
[ ] Hunyuan text encoder (Will)

SolitaryThinker and others added 30 commits March 14, 2025 19:07
Signed-off-by: <>
Co-authored-by: Will Lin <[email protected]>
Co-authored-by: Ubuntu <ubuntu@awesome-gpu-name-8-inst-2tbsnfodvpomxv4tukw2dkfgyvz.c.nv-brev-20240723.internal>
Co-authored-by: Ubuntu <ubuntu@awesome-gpu-name-9-inst-2tpydiudxfu1jg9xvpflm7oexie.c.nv-brev-20240723.internal>
This was referenced Mar 16, 2025
@jzhang38 changed the title from "Rebased refactor" to "[do not merge] Rebased refactor" on Mar 16, 2025
@jzhang38 marked this pull request as draft on March 16, 2025 23:50
Comment on lines +1 to +18
# Copyright 2024 Stability AI, Katherine Crowson and The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#
# Modified from diffusers==0.29.2
#
# ==============================================================================


Suggestion for licensing:

  1. All files need Apache license headers. It is recommended to use the simplified one-liner:
# SPDX-License-Identifier: Apache-2.0
  2. For files copied from other projects, if that project is Apache-2.0 as well (e.g. vLLM/SGLang), then just:
# SPDX-License-Identifier: Apache-2.0
# Adapted from: <link to the original file>
  3. For files copied from projects under other licenses, you need to read their copyright and make sure you are allowed to copy the code. If so, the header could be:
# Copyright 2024 Stability AI, Katherine Crowson and The HuggingFace Team. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
# Adapted from: <link to the original file>

One way to avoid carrying the full license text in every copied file is to keep a license file per upstream project, such as https://github.com/awslabs/raf/tree/main/licenses

Comment on lines 120 to 121
modules["text_encoder"] = text_encoder
modules["text_encoder_2"] = text_encoder_2


Where are these modules used, and how are the keys (e.g., text_encoder_2) determined? A more general question: if a model provider wants to implement a pipeline, is it easy for them to know how to implement this function?


Same question as the one above, I think.

self.add_stage("decoding_stage",
               DecodingStage())

def initialize_pipeline(self, inference_args: InferenceArgs):


Ditto. It's not clear what should be initialized in this function.

Comment on lines 170 to 171
# for stage in self._stages:
# batch = stage(batch, inference_args)


Definitely prefer this programming model; otherwise it's meaningless to have stages.

If users want custom logic between stages, you should ask them to implement a custom stage (so of course the API for wrapping custom logic as a pipeline stage has to be clean and simple) and add that stage to the pipeline; otherwise it's hard to apply pipeline-level optimizations because you may not have the full picture.
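A minimal sketch of that model (class and field names here are hypothetical, not the PR's actual API): custom logic becomes its own stage, and the pipeline stays a plain loop over stages.

class PipelineStage:
    """Hypothetical base class: one unit of work in the pipeline."""

    def __call__(self, batch, inference_args):
        raise NotImplementedError


class MyCustomStage(PipelineStage):
    """User-provided logic that would otherwise sit between stages."""

    def __call__(self, batch, inference_args):
        # e.g. rescale latents, swap embeddings, log intermediate tensors, ...
        return batch


class ComposedPipeline:
    def __init__(self, stages):
        self._stages = list(stages)

    def forward(self, batch, inference_args):
        # The whole pipeline is just this loop -- no ad-hoc code in between,
        # so pipeline-level optimizations can see the full picture.
        for stage in self._stages:
            batch = stage(batch, inference_args)
        return batch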

Returns:
The updated batch information after this stage's processing.
"""
pass


  1. Should this raise NotImplementedError?
  2. Also, the name _call_implementation doesn't seem clear and concise enough. Maybe just call()? (It's better to make this a public API without the underscore prefix; see the sketch below.)
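For instance, a sketch (the base-class name is hypothetical) combining abc with an explicit raise:

from abc import ABC, abstractmethod


class PipelineStage(ABC):
    @abstractmethod
    def call(self, batch, inference_args):
        """Process the batch and return the updated batch."""
        # abstractmethod prevents instantiating the base class; the explicit
        # raise also catches subclasses that call super() without overriding.
        raise NotImplementedError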

Comment on lines 91 to 92
# Just call the implementation directly if logging is disabled
return self._call_implementation(batch, inference_args)


For a better user experience, it's better to validate the input and output batch (e.g., check for expected keys) before and after calling the implementation.
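A sketch of what that wrapper could look like; the expected_input_keys / expected_output_keys attributes are assumptions, and the batch is assumed to expose its fields as attributes:

class VerifiedStage:
    expected_input_keys: set = set()
    expected_output_keys: set = set()

    def _call_implementation(self, batch, inference_args):
        raise NotImplementedError

    def __call__(self, batch, inference_args):
        # Validate inputs before running the stage ...
        missing = self.expected_input_keys - set(vars(batch))
        if missing:
            raise ValueError(f"{type(self).__name__}: missing inputs {missing}")
        batch = self._call_implementation(batch, inference_args)
        # ... and outputs afterwards, so errors surface at the right stage.
        missing = self.expected_output_keys - set(vars(batch))
        if missing:
            raise ValueError(f"{type(self).__name__}: missing outputs {missing}")
        return batch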

prompt_embeds_2 = batch.prompt_embeds_2

# Run denoising loop
with self.progress_bar(total=num_inference_steps) as progress_bar:


We should not handle display concerns in this function.
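One option (sketch; stage and field names are illustrative) is for the stage to report progress through an optional callback and leave rendering to the caller:

class DenoisingStage:
    def __call__(self, batch, inference_args, on_step=None):
        num_steps = inference_args.num_inference_steps
        for i in range(num_steps):
            # ... one denoising step updating `batch` ...
            if on_step is not None:
                on_step(i, num_steps)  # caller decides how (or whether) to display
        return batch

# Caller side, e.g. with tqdm:
#   with tqdm(total=args.num_inference_steps) as bar:
#       batch = stage(batch, args, on_step=lambda i, n: bar.update(1))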

# TODO(will): finalize the interface
with set_forward_context(num_step=i, inference_args=inference_args):
    # Run transformer
    noise_pred = self.transformer(


This pattern is confusing, because it means this stage cannot be used without the pipeline. Ideally every stage should be self-contained. For example:

def __init__(self, ...):
    self.transformers = ...

def _call_implementation(self, ...):
    batch = self.transformers(...)


inference_hunyuan_v1 instead of v1_inference_hunyuan?

predict.py Outdated
@@ -12,7 +12,7 @@
import torchvision
from cog import BasePredictor, Input, Path
from einops import rearrange

from transformers import LlamaForCausalLM

nit: I think you do need a blank line here to keep the imports in three sections?


please run linting and code style formatter


I think we can delete this file? @SolitaryThinker


do we still need a separate predict.py?

  • First, this file feels a bit out of place.
  • Second, remember we are pursuing a CLI-like interface, e.g. fastvideo generate --args (rough sketch below)
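A rough sketch of that kind of entry point (argparse-based; the subcommand and flag names here are placeholders, not an agreed interface):

import argparse


def main():
    parser = argparse.ArgumentParser(prog="fastvideo")
    subcommands = parser.add_subparsers(dest="command", required=True)

    generate = subcommands.add_parser("generate", help="generate a video")
    generate.add_argument("--prompt", required=True)
    generate.add_argument("--num-inference-steps", type=int, default=50)
    generate.add_argument("--output", default="output.mp4")

    args = parser.parse_args()
    if args.command == "generate":
        # build InferenceArgs from the parsed flags and run the pipeline
        ...


if __name__ == "__main__":
    main()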

env_setup.sh Outdated

nit: can we just put the dependencies in the toml and remove this file?

@@ -0,0 +1,213 @@
# SPDX-License-Identifier: Apache-2.0

not sure if you have done this, but if this is copied/adapted from vLLM, let's mark the source file in the header?

def load(self, model_path: str, architecture: str, inference_args: InferenceArgs):
    """Load the scheduler based on the model path, architecture, and inference args."""

    scheduler = get_scheduler(

just educate me: what do we need to load for the scheduler?


I am confused about the encoder design here. We have a models/encoder folder where we put different encoder models, yet we have a separate text_encoder.py. What are their boundaries?


so, why do we need a composed package? This package is rather thin; it seems to only provide an abstraction over the ComposedPipeline object?


maybe a simple structure is just like:

pipelines/
-- composed_pipeline_base.py
-- hunyuan_pipeline.py
-- pipeline_registry.py
-- stages/
...?

# prompt_template
prompt_template = PROMPT_TEMPLATE["image"]

# prompt_template_video

I feel constant.py is unnecessary; we could just put everything into one hunyuan_pipeline.py


+1, I think it's not good to split


# or

batch = self.input_validation_stage(batch, inference_args)

agree with @comaniac. I am confused here -- above you already have clearly defined stages, so the block below could just be discarded?

@richardliaw left a comment


Agree with others' comments about avoiding the add_stage API in favor of something more fluent.
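e.g. something like the following (sketch; with_stage and the chaining style are hypothetical, not an existing API):

class ComposedPipelineBase:
    def __init__(self):
        self._stages = []

    def with_stage(self, stage):
        self._stages.append(stage)
        return self  # returning self allows fluent chaining

# pipeline = (HunyuanVideoPipeline()
#             .with_stage(TextEncodingStage())
#             .with_stage(DenoisingStage())
#             .with_stage(DecodingStage()))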


class HunyuanVideoPipeline(ComposedPipelineBase):

def required_config_modules(self):


Why is this a func and not a property or a class var?
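e.g. as a class variable (sketch; the exact module names listed here are illustrative):

class HunyuanVideoPipeline(ComposedPipelineBase):
    # Declarative class variable instead of a method:
    required_config_modules = [
        "text_encoder", "text_encoder_2", "vae", "transformer", "scheduler",
    ]

    # Or a property, if a subclass ever needs to compute it:
    # @property
    # def required_config_modules(self):
    #     return [...]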

Comment on lines +51 to +52
def initialize_encoders(self, inference_args: InferenceArgs):
self.initialize_encoders_v1(inference_args)


Seems like boilerplate?

encoder_1.to(dtype=PRECISION_TO_TYPE[inference_args.text_encoder_precision])
encoder_1.requires_grad_(False)

print(f"keys: {self.modules.keys()}")


remove

device=inference_args.device if not inference_args.use_cpu_offload else "cpu",
)

encoder_2 = self.modules.pop("text_encoder_2")


why do you have to pop?

Comment on lines 107 to 116
text_encoder_2 = TextEncoder(
    text_encoder=encoder_2,
    tokenizer=tokenizer_2,
    # text_encoder_type="text_encoder_2",
    max_length=inference_args.text_len_2,
    # text_encoder_precision=inference_args.text_encoder_precision,
    device=inference_args.device if not inference_args.use_cpu_offload else "cpu",
)
self.modules["text_encoder"] = text_encoder
self.modules["text_encoder_2"] = text_encoder_2


Why do you do this pop-and-set dance? Feels like you should have something more functional that doesn't make the user write all of this boilerplate -- you set the device in two places, and requires_grad_ and finding the tokenizer are basically boilerplate.
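For example, a small factory could hide the shared device/dtype/requires_grad/tokenizer setup (sketch; build_text_encoder, load_encoder, and load_tokenizer are hypothetical helpers, not existing functions):

def build_text_encoder(name, inference_args, max_length):
    encoder = load_encoder(name, inference_args)      # hypothetical loader call
    tokenizer = load_tokenizer(name, inference_args)  # hypothetical loader call
    encoder.to(dtype=PRECISION_TO_TYPE[inference_args.text_encoder_precision])
    encoder.requires_grad_(False)
    device = "cpu" if inference_args.use_cpu_offload else inference_args.device
    return TextEncoder(text_encoder=encoder, tokenizer=tokenizer,
                       max_length=max_length, device=device)

# modules["text_encoder"] = build_text_encoder("text_encoder", inference_args, inference_args.text_len)
# modules["text_encoder_2"] = build_text_encoder("text_encoder_2", inference_args, inference_args.text_len_2)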
