> `FluxSingleTransformerBlock` parallelizes Attention and the Feed-Forward Network (FFN), and thus fuses the `to_out` linear layer of the attention block and the second linear layer of the FFN into one `proj_out` layer. We split this `proj_out` layer back into two linear layers, one used in the attention block and the other used as the FFN's second FC layer, following the Nunchaku implementation.
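For intuition, the splitting described above amounts to slicing the fused weight matrix along its input dimension, since `proj_out` is applied to the concatenation of the attention output and the FFN hidden states. Here is a minimal PyTorch sketch of that idea; `split_fused_proj_out` and the shape assumptions (attention output width equal to `proj_out.out_features`) are mine for illustration, not the actual Nunchaku code:

```python
import torch
import torch.nn as nn

def split_fused_proj_out(proj_out: nn.Linear, attn_dim: int):
    """Split a fused proj_out into two Linears along its input features.

    Assumes proj_out was applied as
        proj_out(torch.cat([attn_output, mlp_hidden], dim=-1))
    with attn_output of width attn_dim, as in FluxSingleTransformerBlock.
    """
    dim = proj_out.out_features
    mlp_hidden_dim = proj_out.in_features - attn_dim

    attn_to_out = nn.Linear(attn_dim, dim, bias=proj_out.bias is not None)
    ffn_fc2 = nn.Linear(mlp_hidden_dim, dim, bias=False)

    with torch.no_grad():
        # Columns of the fused weight line up with the concatenated inputs.
        attn_to_out.weight.copy_(proj_out.weight[:, :attn_dim])
        ffn_fc2.weight.copy_(proj_out.weight[:, attn_dim:])
        if proj_out.bias is not None:
            # The single fused bias can live in either branch.
            attn_to_out.bias.copy_(proj_out.bias)

    return attn_to_out, ffn_fc2

# Equivalence check: the two branches summed match the fused layer.
dim, mlp_ratio = 64, 4
fused = nn.Linear(dim + mlp_ratio * dim, dim)
attn_to_out, ffn_fc2 = split_fused_proj_out(fused, attn_dim=dim)

a = torch.randn(2, dim)              # attention output
m = torch.randn(2, mlp_ratio * dim)  # FFN hidden states
assert torch.allclose(
    fused(torch.cat([a, m], dim=-1)),
    attn_to_out(a) + ffn_fc2(m),
    atol=1e-5,
)
```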
In this code, we can see that only the `proj_out` layers of `FluxSingleTransformerBlock` are converted to `ConcatLinear`, with only a single split, `[module.proj_out.out_features]`. Can anyone help explain the reasoning behind this?

- Why is this conversion applied only to the `proj_out` of `FluxSingleTransformerBlock`?
- Should this operation be performed on all transformer-based diffusion models?
- Why `ConcatLinear`?

Thank you in advance.